$$c(\mathfrak{m}, E_{k}(\chi, 1))=\sum_{\substack{\mathfrak{a} \mid \mathfrak{m} \\ (\mathfrak{m}/\mathfrak{a}, \mathfrak{n})=1}} \chi(\mathfrak{m}/\mathfrak{a})\, \mathrm{N}\mathfrak{a}^{k-1}$$
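For orientation, the divisor sum above can be evaluated directly in the simplest setting. The following is only a sketch under strong simplifying assumptions that are not made in the text: it takes $F=\mathbf{Q}$, so that ideals are positive integers and $\mathrm{N}\mathfrak{a}=a$, and it replaces the group-ring-valued character by an ordinary Dirichlet character. The names `eisenstein_coefficient` and `chi_mod_3` are hypothetical.

```python
from math import gcd


def chi_mod_3(a: int) -> int:
    """Illustrative odd Dirichlet character modulo 3 (an assumption, not from the text)."""
    return {0: 0, 1: 1, 2: -1}[a % 3]


def eisenstein_coefficient(m: int, k: int, n: int, chi) -> int:
    """Sum of chi(m/a) * a^(k-1) over divisors a of m with gcd(m/a, n) = 1.

    This is the F = Q specialisation of the coefficient formula, with the
    ideal norm N(a) replaced by the integer a itself.
    """
    total = 0
    for a in range(1, m + 1):
        if m % a != 0:
            continue  # a must divide m
        quotient = m // a
        if gcd(quotient, n) != 1:
            continue  # only quotients prime to the conductor contribute
        total += chi(quotient) * a ** (k - 1)
    return total


if __name__ == "__main__":
    # Weight 1 and weight 3 coefficients at m = 4 for conductor n = 3.
    print(eisenstein_coefficient(4, 1, 3, chi_mod_3))
    print(eisenstein_coefficient(4, 3, 3, chi_mod_3))
```

In the article itself the character takes values in a group ring, so the integer arithmetic here would be replaced by arithmetic in that ring; the shape of the sum is unchanged.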
To describe the constant coefficients of $E_{k}(\chi, 1)$ we first set $S$ to be minimal, i.e., the union of $S_{\infty}$ and $S_{\text{ram}}$, where the latter is the set of primes dividing $\mathfrak{n}$. Next we assume for the remainder of the article that $\mathfrak{n} \neq 1$; the case $\mathfrak{n}=1$ causes no difficulties but the formulas must be slightly modified. We then have
$$c_{\lambda}\left(0, E_{k}(\chi, 1)\right)= \begin{cases}0, & k>1 \\ 2^{-n} \Theta_{S}^{\#}, & k=1\end{cases}$$
Here $\Theta_{S}=\Theta_{S, \phi}(H / F, 0) \in \mathbf{Q}[G]$ denotes the $S$-depleted but unsmoothed Stickelberger element. We have $E_{k}(\chi, 1) \in M_{k}\left(\mathfrak{n}, \chi ; R_{p}\right)$ for $k>1$ and $E_{1}(\chi, 1) \in M_{1}\left(\mathfrak{n}, \chi ; \operatorname{Frac}\left(R_{p}\right)\right)$ because of the possible nonintegrality of the constant term.

6.4. Eisenstein series to cusp forms

In order to define a cusp form from the Eisenstein series, one is led to consider certain linear combinations of the analogues of $E_{k}(\chi, 1)$ as $H$ ranges over all its CM subfields containing $F$. This process also incorporates smoothing at the primes in $T$. We avoid stating the slightly complicated formula here (see [17, Proposition 8.14]), but the end result is a group ring valued form $W_{k}(\chi, 1)$ whose constant terms are given by
$$c_{\lambda}\left(0, W_{k}(\chi, 1)\right)= \begin{cases}0, & k>1 \\ 2^{-n} \Theta^{\#}, & k=1\end{cases} \tag{6.12}$$
where we remind the reader that $\Theta^{\#}=\Theta_{\Sigma, \Sigma^{\prime}}^{\#}$. Building off the computations of [19], we calculate in [17, §8] the constant terms of the $W_{k}(\chi, 1)$ at all cusps; the terms in (6.12) can be viewed as the constant terms "at infinity." Indeed, it is the attempt to cancel the constant terms at other cusps that leads naturally to the definition of the $W_{k}(\chi, 1)$.
In order to define a cusp form, we apply two important results of Silliman [43]. The first of these generalizes a result of Hida and Wiles and is stated below.
Theorem 6.2 ([43, Theorem 10.7]). Let $m$ be a fixed positive integer. For positive integers $k \equiv 0 \pmod{(p-1) p^{N}}$ with $N$ sufficiently large, there is a Hilbert modular form $V_{k}$ of level 1, trivial nebentypus, and weight $k$ defined over $\mathbf{Z}_{p}$ such that
$$V_{k} \equiv 1 \pmod{p^{m}}$$
and such that the normalized constant term of $V_{k}$ at every cusp is congruent to $1 \pmod{p^{m}}$.
The idea to construct a cusp form is to fix a very large integer $m$ and to consider the product $W_{1}(\chi, 1) V_{k} \in M_{k+1}\left(\mathfrak{n}, \chi, R_{p}\right)$ with $V_{k}$ as in Theorem 6.2. This series has constant terms at infinity congruent to $2^{-n} \Theta^{\#}$ modulo $p^{m}$. One then wants to subtract off $2^{-n} \Theta^{\#} H_{k+1}(\chi)$ for some group ring valued form $H_{k+1}(\chi) \in M_{k+1}\left(\mathfrak{n}, \chi, R_{p}\right)$ to obtain a cusp form. If there exists a prime above $p$ dividing $\mathfrak{n}$ (i.e., $\Sigma-S_{\infty}$ is nonempty), then this strategy works. Silliman's second result, which generalizes a result of Chai and is stated in [43, Theorem 10.10], implies that one can obtain a form that is cuspidal at the cusps "above infinity at $p$" in this fashion. Applying Hida's ordinary operator then yields a form that is cuspidal.
Theorem 6.3 ([17, Theorem 8.18]). Suppose $\operatorname{gcd}(\mathfrak{n}, p) \neq 1$. For positive integers $k \equiv 1 \pmod{(p-1) p^{N}}$ and $N$ sufficiently large, there exists $H_{k}(\chi) \in M_{k}\left(\mathfrak{n}, \chi, R_{p}\right)$ such that
$$F_{k}(\chi)=e_{p}^{\text{ord}}\left(W_{1}(\chi, 1) V_{k-1}-\Theta^{\#} H_{k}(\chi)\right)$$
lies in $S_{k}(\mathfrak{n} p, R, \chi)$.
The significance of Theorem 6.3 is that we have now constructed a cusp form that is congruent to an Eisenstein series modulo $\Theta^{\#}$.
When $\operatorname{gcd}(\mathfrak{n}, p)=1$, the construction of the cusp form is in fact more interesting, and a new feature appears. In this case, the ordinary operator at $p$ does not annihilate the form $W_{k}(\chi, 1)$, and it must be incorporated into our linear combination. Moreover, this apparent cost has a great benefit: we obtain a congruence between a cusp form and Eisenstein series not only modulo $\Theta^{\#}$, but modulo a multiple $x \cdot \Theta^{\#}$ for a certain $x \in R_{p}$. This element $x$ has an intuitive meaning: it represents the trivial zeroes of the $p$-adic $L$-function associated to $\chi$, even the "$\bmod\ p$ trivial zeroes." The precise definition is as follows.
Lemma 6.4. Suppose $\operatorname{gcd}(\mathfrak{n}, p)=1$. For positive $k \equiv 1 \pmod{(p-1) p^{N}}$ with $N$ sufficiently large, the element
$$x=\frac{\Theta_{S_{\infty}}(1-k)}{\Theta_{S_{\infty}}(0)} \in \operatorname{Frac}(R)$$
lies in $R$ and is a non-zero-divisor.
The analogue of Theorem 6.3 for $\operatorname{gcd}(\mathfrak{n}, p)=1$ is as follows.
Theorem 6.5 ([17, Theorem 8.17]). Suppose $\operatorname{gcd}(\mathfrak{n}, p)=1$. For positive integers $k \equiv 1 \pmod{(p-1) p^{N}}$ and $N$ sufficiently large, there exists $H_{k}(\chi) \in M_{k}\left(\mathfrak{n}, \chi, R_{p}\right)$ such that
$$F_{k}(\chi)=e_{p}^{\text{ord}}\left(x W_{1}(\chi, 1) V_{k-1}-W_{k}(\chi, 1)-x \Theta^{\#} H_{k}(\chi)\right)$$
lies in $S_{k}(\mathfrak{n} p, R, \chi)$.
The extra factor of $x$ in our congruence between the cusp form $F_{k}(\chi)$ and a linear combination of Eisenstein series plays an extremely important role in showing that the Galois cohomology classes we construct are unramified at $p$.
We conclude this section by interpreting the congruences of Theorems 6.3 and 6.5 in terms of Hecke algebras. We consider the Hecke algebra $\tilde{\mathbf{T}}$ generated over $R_{p}$ by the operators $T_{\mathfrak{q}}$ for primes $\mathfrak{q} \nmid \mathfrak{n} p$ and $U_{\mathfrak{p}}$ for primes $\mathfrak{p} \mid p$. (We ignore the operators $U_{\mathfrak{q}}$ for $\mathfrak{q} \mid \mathfrak{n}, \mathfrak{q} \nmid p$ in order to avoid issues regarding nonreducedness of Hecke algebras arising from the presence of oldforms.) We denote by $\mathbf{T}=e_{p}^{\text{ord}}(\tilde{\mathbf{T}})$ Hida's ordinary Hecke algebra associated to $\tilde{\mathbf{T}}$. Let $\epsilon_{\text{cyc}}: G_{F} \rightarrow \mathbf{Z}_{p}^{*}$ denote the $p$-adic cyclotomic character of $F$. Theorems 6.3 and 6.5 then yield:
Theorem 6.6. Let $x=1$ if $\operatorname{gcd}(\mathfrak{n}, p) \neq 1$ and let $x$ be as in Lemma 6.4 if $\operatorname{gcd}(\mathfrak{n}, p)=1$. There exists an $R_{p} / x \Theta^{\#}$-algebra $W$ and a surjective $R_{p}$-algebra homomorphism $\varphi: \mathbf{T} \rightarrow W$ satisfying the following properties.
  • The structure map $R_{p} / x \Theta^{\#} \rightarrow W$ is an injection.
  • $\varphi\left(T_{\mathfrak{l}}\right)=\epsilon_{\text{cyc}}^{k-1}(\mathfrak{l})+\chi(\mathfrak{l})$ for $\mathfrak{l} \nmid \mathfrak{n} p$.
  • $\varphi\left(U_{\mathfrak{p}}\right)=1$ for $\mathfrak{p} \mid \operatorname{gcd}(\mathfrak{n}, p)$.
  • Let
$$U=\prod_{\mathfrak{p} \mid p,\ \mathfrak{p} \nmid \mathfrak{n}}\left(U_{\mathfrak{p}}-\chi(\mathfrak{p})\right) \in \mathbf{T}$$
If $y \in R_{p}$ and $\varphi(U) y=0$ in $W$, then $y \in\left(\Theta^{\#}\right)$.
The idea of this theorem is the usual one: the homomorphism $\varphi$ sends a Hecke

point: the operators $U_{\mathfrak{p}}$ for $\mathfrak{p} \mid p, \mathfrak{p} \nmid \mathfrak{n}$ do not act as scalars, so a more involved argument is necessary. This explains why the ring $W$ is not just $R_{p} / x \Theta^{\#}$. The idea behind the last statement of the theorem is that the operator $\varphi(U)$ introduces a factor of $x$; if $x y$ is divisible by $x \Theta^{\#}$ in $R_{p}$, then $y$ is divisible by $\Theta^{\#}$ since $x$ is a non-zero-divisor. This demonstrates the essential additional ingredient provided by the "higher congruence" modulo $x \Theta^{\#}$ rather than just modulo $\Theta^{\#}$. See [17, Theorem 8.23] for details.

6.5. Cusp forms to Galois representations

In this section we study the Galois representation attached to cusp forms that are congruent to Eisenstein series. Let $\mathfrak{m}$ be the intersection of the finitely many maximal ideals of $\mathbf{T}$ containing the kernel of $\varphi$. Put $\mathbf{T}_{\mathfrak{m}}$ for the completion of $\mathbf{T}$ with respect to $\mathfrak{m}$ and $K=\operatorname{Frac}\left(\mathbf{T}_{\mathfrak{m}}\right)$. Then $K$ is a finite product of fields parameterized by the $\mathbf{Q}_{p}$-Galois orbits of cuspidal newforms of weight $k$ and level $\mathfrak{n}$, defined over the ring of integers in a finite extension of $\mathbf{Z}_{p}$, that are congruent to an Eisenstein series modulo the maximal ideal. As in [17, §9.2], the work of Hida and Wiles gives a Galois representation
$$\rho: G_{F} \rightarrow \mathrm{GL}_{2}(K)$$
satisfying the following:
(1) $\rho$ is unramified outside $\mathfrak{n} p$.
(2) For all primes $\mathfrak{l} \nmid \mathfrak{n} p$, the characteristic polynomial of $\rho(\mathrm{Frob}_{\mathfrak{l}})$ is given by
$$\operatorname{char}\left(\rho\left(\mathrm{Frob}_{\mathfrak{l}}\right)\right)(x)=x^{2}-T_{\mathfrak{l}}\, x+\chi(\mathfrak{l})\, \mathrm{N}\mathfrak{l}^{k-1} \tag{6.13}$$
(3) For every $\mathfrak{q} \mid p$, let $G_{\mathfrak{q}}$ denote a decomposition group at $\mathfrak{q}$. We have
$$\left.\rho\right|_{G_{\mathfrak{q}}} \sim\begin{pmatrix} \chi \varepsilon_{\mathrm{cyc}}^{k-1} \eta_{\mathfrak{q}}^{-1} & * \\ 0 & \eta_{\mathfrak{q}} \end{pmatrix}$$
where $\varepsilon_{\text{cyc}}$ is the $p$-adic cyclotomic character and $\eta_{\mathfrak{q}}$ is an unramified character given by $\eta_{\mathfrak{q}}\left(\operatorname{rec}_{\mathfrak{q}}\left(\varpi^{-1}\right)\right)=U_{\mathfrak{q}}$, with $\varpi$ a uniformizer of $F_{\mathfrak{q}}^{*}$.
Let $\mathbf{I}$ denote the kernel of $\varphi$ extended to $\varphi: \mathbf{T}_{\mathfrak{m}} \rightarrow W$. Reducing (6.13) modulo $\mathbf{I}$, and using Čebotarev to extend from $\mathrm{Frob}_{\mathfrak{l}}$ to all $\sigma \in G_{F}$, we see that the characteristic polynomial of $\rho(\sigma)$ is congruent to $(x-\chi(\sigma))\left(x-\epsilon_{\mathrm{cyc}}^{k-1}(\sigma)\right) \pmod{\mathbf{I}}$. In particular, if $\chi(\sigma) \not\equiv \epsilon_{\mathrm{cyc}}^{k-1}(\sigma) \pmod{\mathfrak{m}}$, then by Hensel's lemma $\rho(\sigma)$ has distinct eigenvalues $\lambda_{1}, \lambda_{2} \in \mathbf{T}_{\mathfrak{m}}$ such that $\lambda_{1} \equiv \epsilon_{\text{cyc}}^{k-1}(\sigma) \pmod{\mathbf{I}}$ and $\lambda_{2} \equiv \chi(\sigma) \pmod{\mathbf{I}}$.
To define a convenient basis for $\rho$, we choose $\tau \in G_{F}$ such that:
(1) $\tau$ restricts to the complex conjugation of $G$,
(2) for each $\mathfrak{q} \mid p$, the eigenspace of $\left.\rho\right|_{G_{\mathfrak{q}}}$ projected to each factor of $K$ is not stable under $\rho(\tau)$.
See [17, Proposition 9.3] for the existence of such $\tau$. Since $p \neq 2$, we have
$$\chi(\tau)=-1 \not\equiv 1 \equiv \epsilon_{\mathrm{cyc}}^{k-1}(\tau) \pmod{\mathfrak{m}}$$
It follows from the discussion above that the eigenvalues of $\rho(\tau)$ satisfy $\lambda_{1} \equiv \epsilon_{\mathrm{cyc}}^{k-1}(\tau) \pmod{\mathbf{I}}$ and $\lambda_{2} \equiv-1 \pmod{\mathbf{I}}$. Fix the basis consisting of eigenvectors of $\rho(\tau)$, say
$$\rho(\tau)=\begin{pmatrix} \lambda_{1} & 0 \\ 0 & \lambda_{2} \end{pmatrix}$$
For a general $\sigma \in G_{F}$, write
$$\rho(\sigma)=\begin{pmatrix} a(\sigma) & b(\sigma) \\ c(\sigma) & d(\sigma) \end{pmatrix}$$
For each $\mathfrak{q} \mid p$, there is a change of basis matrix
$$M_{\mathfrak{q}}=\begin{pmatrix} A_{\mathfrak{q}} & B_{\mathfrak{q}} \\ C_{\mathfrak{q}} & D_{\mathfrak{q}} \end{pmatrix} \in \mathrm{GL}_{2}(K)$$
satisfying
$$\begin{pmatrix} a(\sigma) & b(\sigma) \\ c(\sigma) & d(\sigma) \end{pmatrix} M_{\mathfrak{q}}=M_{\mathfrak{q}}\begin{pmatrix} \chi \varepsilon_{\mathrm{cyc}}^{k-1} \eta_{\mathfrak{q}}^{-1} & * \\ 0 & \eta_{\mathfrak{q}} \end{pmatrix} \tag{6.14}$$
The second condition in the choice of $\tau$ ensures that $A_{\mathfrak{q}}, C_{\mathfrak{q}} \in K^{*}$. Furthermore, equating the upper left-hand entries in (6.14) gives
$$b(\sigma)=\frac{A_{\mathfrak{q}}}{C_{\mathfrak{q}}}\left(a(\sigma)-\chi \varepsilon_{\mathrm{cyc}}^{k-1} \eta_{\mathfrak{q}}^{-1}(\sigma)\right) \quad \text{for all } \sigma \in G_{\mathfrak{q}} \tag{6.15}$$

6.6. Galois representations to Galois cohomology classes

We summarize [17, §9.3]. As explained above, we have
$$a(\sigma)+d(\sigma) \equiv \chi(\sigma)+\epsilon_{\mathrm{cyc}}^{k-1}(\sigma) \pmod{\mathbf{I}} \quad \text{for all } \sigma \in G_{F} \tag{6.16}$$
Applying the same rule for $\tau \sigma$ and noting that $a(\tau \sigma)=\lambda_{1} a(\sigma)$, $d(\sigma \tau)=\lambda_{2} d(\sigma)$, we find
$$a(\sigma) \epsilon_{\mathrm{cyc}}^{k-1}(\tau)-d(\sigma) \equiv-\chi(\sigma)+\epsilon_{\mathrm{cyc}}^{k-1}(\sigma \tau) \pmod{\mathbf{I}} \tag{6.17}$$
Solving the congruences (6.16) and (6.17) and once again using the fact that $\epsilon_{\mathrm{cyc}}^{k-1}(\tau) \not\equiv-1 \pmod{\mathfrak{m}}$ since $p \neq 2$, we find that $a(\sigma), d(\sigma) \in \mathbf{T}_{\mathfrak{m}}$ and
$$a(\sigma) \equiv \varepsilon_{\mathrm{cyc}}^{k-1}(\sigma) \pmod{\mathbf{I}}, \quad d(\sigma) \equiv \chi(\sigma) \pmod{\mathbf{I}} \quad \text{for all } \sigma \in G_{F} \tag{6.18}$$
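The step from (6.16) and (6.17) to (6.18) can be made explicit as a two-by-two linear system. The sketch below introduces the shorthand $t_{1}=\operatorname{tr} \rho(\sigma)$ and $t_{2}=\operatorname{tr} \rho(\tau \sigma)$, which is not notation from the text, and assumes that both traces lie in $\mathbf{T}_{\mathfrak{m}}$, as is implicit in (6.16).

```latex
% Shorthand (an assumption of this sketch): t_1 = tr rho(sigma), t_2 = tr rho(tau sigma).
\begin{align*}
a(\sigma) + d(\sigma) &= t_1, &
\lambda_1 a(\sigma) + \lambda_2 d(\sigma) &= t_2, \\
a(\sigma) &= \frac{t_2 - \lambda_2 t_1}{\lambda_1 - \lambda_2}, &
d(\sigma) &= \frac{\lambda_1 t_1 - t_2}{\lambda_1 - \lambda_2}.
\end{align*}
% Since lambda_1 - lambda_2 = eps^{k-1}(tau) + 1 modulo I is nonzero modulo m,
% it is a unit of T_m, so a(sigma) and d(sigma) lie in T_m. Reducing modulo I:
\begin{align*}
a(\sigma) &\equiv
  \frac{\bigl(-\chi(\sigma) + \epsilon_{\mathrm{cyc}}^{k-1}(\sigma\tau)\bigr)
        + \bigl(\chi(\sigma) + \epsilon_{\mathrm{cyc}}^{k-1}(\sigma)\bigr)}
       {\epsilon_{\mathrm{cyc}}^{k-1}(\tau) + 1}
  = \epsilon_{\mathrm{cyc}}^{k-1}(\sigma), \\
d(\sigma) &= t_1 - a(\sigma) \equiv \chi(\sigma) \pmod{\mathbf{I}}.
\end{align*}
```

The only input beyond linear algebra is that $\epsilon_{\mathrm{cyc}}^{k-1}(\tau)+1$ is invertible, which is exactly where the hypothesis $p \neq 2$ enters.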
Let $B$ be the $\mathbf{T}_{\mathfrak{m}}$-submodule of $K$ generated by $\left\{b(\sigma): \sigma \in G_{F}\right\} \cup\left\{\frac{A_{\mathfrak{q}}}{C_{\mathfrak{q}}}: \mathfrak{q} \in \Sigma \setminus S_{\infty}\right\}$. We have $\rho\left(\sigma \sigma^{\prime}\right)=\rho(\sigma) \rho\left(\sigma^{\prime}\right)$ for $\sigma, \sigma^{\prime} \in G_{F}$. Equating the upper right entries and using equation (6.18), we obtain
$$b\left(\sigma \sigma^{\prime}\right)=a(\sigma) b\left(\sigma^{\prime}\right)+b(\sigma) d\left(\sigma^{\prime}\right) \equiv \varepsilon_{\mathrm{cyc}}^{k-1}(\sigma) b\left(\sigma^{\prime}\right)+\chi\left(\sigma^{\prime}\right) b(\sigma) \pmod{\mathbf{I} B} \tag{6.19}$$
Let $m$ be an integer such that $k \equiv 1 \pmod{(p-1) p^{m}}$. Let $I_{\mathfrak{q}}$ denote the inertia subgroup of $G_{F}$ of a prime $\mathfrak{q}$. Put $B_{1}$ for the $\mathbf{T}_{\mathfrak{m}}$-submodule of $B$ generated by
$$\mathbf{I} B \cup p^{m} B \cup\left\{b(\sigma): \sigma \in I_{\mathfrak{q}} \text{ for } \mathfrak{q} \mid p,\ \mathfrak{q} \notin \Sigma\right\}$$
Define $\bar{B}=B / B_{1}$. Equation (6.19) then gives that $\kappa(\sigma)=\chi^{-1}(\sigma) b(\sigma)$ is a cocycle defining a cohomology class $[\kappa]$ in $H^{1}\left(G_{F}, \bar{B}\left(\chi^{-1}\right)\right)$ satisfying the following local properties:
(1) As $\rho$ is unramified at $\mathfrak{l} \nmid \mathfrak{n} p$, so is the class $[\kappa]$.
(2) As $\bar{B}$ is pro-$p$, the class $[\kappa]$ is at most tamely ramified at any prime $\mathfrak{l} \mid \mathfrak{n}$ not above $p$.
(3) It is proven in [17, §4.1] that we may assume $\Sigma^{\prime}$ does not contain any primes above $p$. Thus $[\kappa]$ is at most tamely ramified at all primes in $\Sigma^{\prime}$.
(4) By the definition of $B_{1}$, where we have included $b\left(I_{\mathfrak{q}}\right)$ for primes $\mathfrak{q} \mid p, \mathfrak{q} \notin \Sigma$, the class $[\kappa]$ is unramified at such $\mathfrak{q}$.
(5) Equation (6.15) implies that $[\kappa]$ is locally trivial at finite primes in $\Sigma$. As $p$ is odd, $[\kappa]$ is locally trivial at archimedean places [17, Proposition 9.5].
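The cocycle property of $\kappa$ asserted before the list can be checked in a few lines. The following is a sketch of that verification; it uses only (6.19), the definition $\kappa=\chi^{-1} b$, and the fact that $\varepsilon_{\mathrm{cyc}}^{k-1}$ takes values congruent to $1$ modulo $p^{m}$ when $k \equiv 1 \pmod{(p-1) p^{m}}$.

```latex
\begin{align*}
\kappa(\sigma\sigma')
  &= \chi^{-1}(\sigma\sigma')\, b(\sigma\sigma') \\
  &\equiv \chi^{-1}(\sigma)\chi^{-1}(\sigma')
     \bigl(\varepsilon_{\mathrm{cyc}}^{k-1}(\sigma)\, b(\sigma')
           + \chi(\sigma')\, b(\sigma)\bigr)
     \pmod{\mathbf{I}B} \\
  &= \kappa(\sigma)
     + \varepsilon_{\mathrm{cyc}}^{k-1}(\sigma)\,\chi^{-1}(\sigma)\,\kappa(\sigma') \\
  &\equiv \kappa(\sigma) + \chi^{-1}(\sigma)\,\kappa(\sigma') \pmod{p^{m}B}.
\end{align*}
```

Since both $\mathbf{I} B$ and $p^{m} B$ are contained in $B_{1}$, the last line is the cocycle relation for the twisted module $\bar{B}\left(\chi^{-1}\right)$. This also suggests why $p^{m} B$ appears among the generators of $B_{1}$: it removes the cyclotomic twist.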

6.7. Galois cohomology classes to class groups

The Galois cohomology class $[\kappa]$ satisfies the conditions listed after equation (6.2) and hence gives a surjection
$$\nabla_{\Sigma}^{\Sigma^{\prime}}(H)_{p}^{-} \rightarrow \bar{B}\left(\chi^{-1}\right)$$
For details see [18, Theorem 4.4]. The general properties of Fitting ideals imply
$$\operatorname{Fitt}_{R_{p}}\left(\nabla_{\Sigma}^{\Sigma^{\prime}}(H)_{p}^{-}\right) \subset \operatorname{Fitt}_{R_{p}}\left(\bar{B}\left(\chi^{-1}\right)\right)$$
It is therefore enough to prove that $\operatorname{Fitt}_{R_{p}}(\bar{B}) \subset\left(\Theta^{\#}\right)$. Typically in Ribet's method, one argues that the fractional ideal $B$ is a faithful $\mathbf{T}_{\mathfrak{m}}$-module, and hence the Fitting ideal of $B / \mathbf{I} B$ is contained in $\mathbf{I}$. However, our module $\bar{B}$ is more complicated than $B / \mathbf{I} B$, so we proceed as follows. Using equation (6.15), we show that any element in $\operatorname{Fitt}_{R_{p}}(\bar{B})$ is annihilated by $\varphi(U)$ for the operator $U$ from Theorem 6.6. The final assertion of this theorem then implies that $\operatorname{Fitt}_{R_{p}}(\bar{B})$ is contained in $\left(\Theta^{\#}\right)$. See [17, §9.5] for details.
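The "general property" of Fitting ideals invoked above is the standard fact that a surjection of finitely generated modules reverses into an inclusion of Fitting ideals. A brief sketch of the usual argument follows, with $M$, $N$, $A$, and $A^{\prime}$ as generic placeholder names that do not appear in the text.

```latex
% Generic sketch: R a commutative ring, M -> N a surjection of finitely
% generated R-modules, with M generated by r elements.
\begin{align*}
M &\cong R^{r} / \operatorname{im}(A), &
N &\cong R^{r} / \operatorname{im}\bigl([\,A \mid A'\,]\bigr), \\
\operatorname{Fitt}_{R}(M) &= \bigl(r \times r \text{ minors of } A\bigr), &
\operatorname{Fitt}_{R}(N) &= \bigl(r \times r \text{ minors of } [\,A \mid A'\,]\bigr).
\end{align*}
% Every r x r minor of A is also an r x r minor of [A | A'], hence
% Fitt_R(M) is contained in Fitt_R(N).
```

Here the columns of $A$ generate the relations among the chosen generators of $M$, and the columns of $A^{\prime}$ are the additional relations imposed in the quotient $N$.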
This concludes our summary of the proof of Theorem 5.6.

7. EXPLICIT FORMULA FOR BRUMER-STARK UNITS

In this final section of the paper, we discuss the first author's explicit formula for Brumer-Stark units as mentioned in §4.3. The conjecture in the case that $F$ is a real quadratic field was studied in [13], and the general case was studied in [15]. Here we consider an arbitrary totally real field $F$, but to simplify formulas we assume that the rational prime $p$ is inert in $F$. Furthermore, we let $H$ be the narrow ray class field of some conductor $\mathfrak{n} \subset O_{F}$ and assume that $p \equiv 1 \pmod{\mathfrak{n}}$. This ensures that the prime $\mathfrak{p}=p O_{F}$ splits completely in $H$. Fix a prime $\mathfrak{P}$ of $H$ above $p$. We fix $S \supset S_{\infty} \cup S_{\text{ram}}=\{v \mid \mathfrak{n} \infty\}$. We also fix a prime ideal $\mathfrak{l} \subset O_{F}$ such that $\mathrm{N} \mathfrak{l}=\ell>n+1$ is a prime integer and let $T=\{\mathfrak{l}\}$.
In this setting, we will present a $p$-adic analytic formula for the image of the Brumer-Stark unit $u_{\mathfrak{p}} \in H^{*}$ in $H_{\mathfrak{P}}^{*} \cong F_{\mathfrak{p}}^{*}$. The most general, conceptually satisfying, and theoretically useful form of this conjecture uses the Eisenstein cocycle. This is a class in the $(n-1)$st cohomology of $\mathrm{GL}_{n}(\mathbf{Z})$ that has many avatars studied by several authors (see [2, 3, 9, 10, 13, 20, 45]). In this paper, we avoid defining the Eisenstein cocycle and present instead the more explicit and down to earth version of the conjectural formula for $u_{\mathfrak{p}}$ stated in [15].

7.1. Shintani's method

Fixing an ordering of the $n$ real embeddings of $F$ yields a map $F \hookrightarrow \mathbf{R}^{n}$ such that the image of any fractional ideal is a cocompact lattice. We let $F^{*}$ act on $\mathbf{R}^{n}$ by composing this embedding with componentwise multiplication and denote the action by $*$.
Let $v_{1}, \ldots, v_{r} \in\left(\mathbf{R}^{>0}\right)^{n}$, $1 \leq r \leq n$, be vectors in the totally positive orthant that are linearly independent over $\mathbf{R}$. The corresponding simplicial cone is defined by
$$C\left(v_{1}, \ldots, v_{r}\right)=\left\{\sum_{i=1}^{r} t_{i} v_{i}: 0<t_{i}\right\} \subset\left(\mathbf{R}^{>0}\right)^{n}$$
Suppose now r = n r = n r=nr=nr=n. We will define a certain union of C ( v 1 , , v n ) C v 1 , … , v n C(v_(1),dots,v_(n))C\left(v_{1}, \ldots, v_{n}\right)C(v1,…,vn) and some of its boundary faces that we call the Colmez closure. Write
( 0 , 0 , , 1 ) = i = 1 n q i v i , q i R ( 0 , 0 , … , 1 ) = ∑ i = 1 n   q i v i , q i ∈ R (0,0,dots,1)=sum_(i=1)^(n)q_(i)v_(i),quadq_(i)inR(0,0, \ldots, 1)=\sum_{i=1}^{n} q_{i} v_{i}, \quad q_{i} \in \mathbf{R}(0,0,…,1)=∑i=1nqivi,qi∈R
For each nonempty subset J { 1 , , n } J ⊂ { 1 , … , n } J sub{1,dots,n}J \subset\{1, \ldots, n\}J⊂{1,…,n}, we say that J J JJJ is positive if q i > 0 q i > 0 q_(i) > 0q_{i}>0qi>0 for all i J i ∉ J i!in Ji \notin Ji∉J. The Colmez closure of C ( v 1 , , v n ) C v 1 , … , v n C(v_(1),dots,v_(n))C\left(v_{1}, \ldots, v_{n}\right)C(v1,…,vn) is defined by:
C ( v 1 , , v n ) = J positive C ( { v j , j J } ) C ∗ v 1 , … , v n = ⨆ J  positive    C v j , j ∈ J C^(**)(v_(1),dots,v_(n))=⨆_(J" positive ")C({v_(j),j in J})C^{*}\left(v_{1}, \ldots, v_{n}\right)=\bigsqcup_{J \text { positive }} C\left(\left\{v_{j}, j \in J\right\}\right)C∗(v1,…,vn)=⨆J positive C({vj,j∈J})
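The definition of the Colmez closure is easy to make concrete. The following is a minimal sketch, not taken from the paper: it solves $(0, \ldots, 0, 1) = \sum_i q_i v_i$ exactly over the rationals and lists the positive subsets $J$. The function names `solve` and `positive_subsets` are hypothetical, and the input vectors are assumed to be linearly independent.

```python
from fractions import Fraction
from itertools import combinations


def solve(A, b):
    """Solve A q = b exactly by Gauss-Jordan elimination (A assumed invertible)."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        piv = M[col][col]
        M[col] = [x / piv for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def positive_subsets(vectors):
    """Return the positive subsets J of {1, ..., n} for the cone C(v_1, ..., v_n).

    J is positive when q_i > 0 for every i not in J, where
    (0, ..., 0, 1) = sum_i q_i v_i.
    """
    n = len(vectors)
    # Column i of A is the vector v_i, so that A q = sum_i q_i v_i.
    A = [[vectors[i][j] for i in range(n)] for j in range(n)]
    q = solve(A, [0] * (n - 1) + [1])
    result = []
    for size in range(1, n + 1):
        for J in combinations(range(1, n + 1), size):
            if all(q[i - 1] > 0 for i in range(1, n + 1) if i not in J):
                result.append(J)
    return result


# Toy planar cone spanned by v_1 = (1, 1) and v_2 = (3, 1).
print(positive_subsets([(1, 1), (3, 1)]))
```

For this toy cone one finds $q_1 = 3/2$ and $q_2 = -1/2$, so the closure consists of the open cone together with the boundary ray through $v_2$, and the ray through $v_1$ is excluded.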
Let $E(\mathfrak{n}) \subset O_F^*$ denote the subgroup of totally positive units $\epsilon$ such that $\epsilon \equiv 1 \pmod{\mathfrak{n}}$. Shintani proved that there exists a union of simplicial cones that is a fundamental domain for the action of $E(\mathfrak{n})$ on $(\mathbf{R}^{>0})^n$. For example, in the real quadratic case ($n = 2$), $E(\mathfrak{n}) = \langle \epsilon \rangle$ is cyclic and $C^*(1, \epsilon)$ is a fundamental domain. In the general case, it can be difficult to write down an explicit fundamental domain, but a nice generalization of the $n = 2$ case is obtained if we allow ourselves to consider instead a signed fundamental domain. For a simplicial cone $C$, let $\mathbf{1}_C$ denote the characteristic function of $C$ on $(\mathbf{R}^{>0})^n$.
Definition 7.1. A signed fundamental domain for the action of $E(\mathfrak{n})$ on $(\mathbf{R}^{>0})^n$ is by definition a formal linear combination $D = \sum_i a_i C_i$ of simplicial cones $C_i$ with $a_i \in \mathbf{Z}$ such that
$$\sum_{u \in E(\mathfrak{n})} \sum_i a_i \mathbf{1}_{C_i}(u * x) = 1$$
for all $x \in (\mathbf{R}^{>0})^n$.
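The defining identity of Definition 7.1 can be tested numerically in the real quadratic case. The sketch below is an illustration under stated assumptions rather than anything from the paper: it takes $F = \mathbf{Q}(\sqrt{2})$ and $\mathfrak{n} = (2)$, for which $\epsilon = 3 + 2\sqrt{2}$ generates the totally positive units and satisfies $\epsilon \equiv 1 \pmod{\mathfrak{n}}$, and it counts how many translates $\epsilon^k * x$ of a point $x$ land in $C^*(1, \epsilon)$. The names `act`, `in_colmez_closure`, and `count_translates` are hypothetical, and the sum over the infinite group is truncated to $|k| \leq$ `kmax`.

```python
import math
import random

# The two real embeddings of eps = 3 + 2*sqrt(2) in Q(sqrt(2)).
EPS = (3 + 2 * math.sqrt(2), 3 - 2 * math.sqrt(2))


def act(k, x):
    """Componentwise action of eps**k on a point x of the positive quadrant."""
    return (EPS[0] ** k * x[0], EPS[1] ** k * x[1])


def in_colmez_closure(x):
    """Test membership of x in C*(1, eps) = C(1, eps) together with the ray C(eps).

    Writes x = t1 * (1, 1) + t2 * eps by Cramer's rule. The open cone is
    t1 > 0, t2 > 0; the included boundary ray is t1 = 0, t2 > 0.
    """
    v1, v2 = (1.0, 1.0), EPS
    det = v1[0] * v2[1] - v1[1] * v2[0]
    t1 = (x[0] * v2[1] - x[1] * v2[0]) / det
    t2 = (v1[0] * x[1] - v1[1] * x[0]) / det
    return t1 >= 0 and t2 > 0


def count_translates(x, kmax=30):
    """Number of k with |k| <= kmax such that eps**k * x lies in C*(1, eps)."""
    return sum(in_colmez_closure(act(k, x)) for k in range(-kmax, kmax + 1))


random.seed(0)
points = [(random.uniform(0.1, 10), random.uniform(0.1, 10)) for _ in range(5)]
for x in points:
    print(x, count_translates(x))
```

In exact arithmetic each count is 1, which is the statement that $C^*(1, \epsilon)$ is a fundamental domain. The floating-point test is only reliable for points away from the boundary rays.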
Fix an ordered basis $\{\epsilon_1, \ldots, \epsilon_{n-1}\}$ for $E(\mathfrak{n})$. Define the orientation
$$w_\epsilon = \operatorname{sign} \det\left(\log(\epsilon_{ij})\right)_{i,j=1}^{n-1} = \pm 1, \tag{7.1}$$
where $\epsilon_{ij}$ denotes the $j$th coordinate of $\epsilon_i$. For each permutation $\sigma \in S_{n-1}$ let
$$v_{i,\sigma} = \epsilon_{\sigma(1)} \cdots \epsilon_{\sigma(i-1)} \in E(\mathfrak{n}), \quad i = 1, \ldots, n.$$
By convention, $v_{1,\sigma} = (1, 1, \ldots, 1)$ for all $\sigma$. Define
$$w_\sigma = (-1)^{n-1} w_\epsilon \operatorname{sign}(\sigma) \operatorname{sign}\left(\det(v_{i,\sigma})_{i=1}^{n}\right) \in \{0, \pm 1\}.$$
The following result was proven independently by Diaz y Diaz-Friedman [22] and Charollois-Dasgupta-Greenberg [10, Theorem 1.5], generalizing the result of Colmez [11] in the case that all $w_\sigma = 1$.

Theorem 7.2. The formal linear combination
$$\sum_{\sigma \in S_{n-1}} w_\sigma C^*(v_{1,\sigma}, \ldots, v_{n,\sigma})$$
is a signed fundamental domain for the action of $E(\mathfrak{n})$ on $(\mathbf{R}^{>0})^n$.
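The weights $w_\sigma$ are directly computable from the real embeddings of the chosen units. The following sketch evaluates (7.1) and the formula for $w_\sigma$ in floating point; it is an illustration only. The helper names are hypothetical, and the cubic example uses two multiplicatively independent totally positive units of the field $\mathbf{Q}(2\cos(2\pi/7))$ without checking that they form a basis of $E(\mathfrak{n})$ for any particular $\mathfrak{n}$.

```python
import math
from itertools import permutations


def sign(x):
    """Sign of a real number as -1, 0, or 1."""
    return (x > 0) - (x < 0)


def perm_sign(p):
    """Sign of a permutation given as a tuple, via its inversion count."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1


def det(M):
    """Determinant by the Leibniz formula (adequate for the small n used here)."""
    n = len(M)
    total = 0.0
    for p in permutations(range(n)):
        term = float(perm_sign(p))
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total


def weights(units):
    """Return {sigma: w_sigma} for units given as tuples of n real embeddings.

    `units` has n - 1 entries; sigma is a tuple of 0-based indices.
    """
    n = len(units) + 1
    # Orientation (7.1): sign of the (n-1) x (n-1) determinant of logarithms.
    w_eps = sign(det([[math.log(units[i][j]) for j in range(n - 1)]
                      for i in range(n - 1)]))
    out = {}
    for sigma in permutations(range(n - 1)):
        v = (1.0,) * n  # v_{1,sigma} = (1, ..., 1)
        rows = [v]
        for i in range(n - 1):
            # v_{i+2,sigma} = v_{i+1,sigma} * eps_{sigma(i+1)}, componentwise.
            v = tuple(a * b for a, b in zip(v, units[sigma[i]]))
            rows.append(v)
        out[sigma] = (-1) ** (n - 1) * w_eps * perm_sign(sigma) * sign(det(rows))
    return out


# Real quadratic illustration: eps = 3 + 2*sqrt(2) in Q(sqrt(2)).
quadratic = [(3 + 2 * math.sqrt(2), 3 - 2 * math.sqrt(2))]
print(weights(quadratic))

# Cubic illustration: embeddings of a = 2cos(2*pi/7); units a^2 and (1 + a)^2.
roots = [2 * math.cos(2 * math.pi * k / 7) for k in (1, 2, 3)]
cubic = [tuple(r * r for r in roots), tuple((1 + r) ** 2 for r in roots)]
print(weights(cubic))
```

In the quadratic case there is a single permutation and its weight is $1$, consistent with the statement that $C^*(1, \epsilon)$ is an honest fundamental domain.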

7.2. The formula

Throughout this section assume that $p$ is odd. Recall that $T = \{\mathfrak{l}\}$. Let $\mathfrak{b}$ be a fractional ideal that is relatively prime to $\mathfrak{n}\mathfrak{l}$. Let $D = \sum a_i C_i$ be the signed fundamental domain for the action of $E(\mathfrak{n})$ on $(\mathbf{R}^{>0})^n$ given in Theorem 7.2. We use all this data to define a $\mathbf{Z}$-valued measure $\mu$ on $O_p$, the $p$-adic completion of $O_F$. Fix an element $z \in \mathfrak{b}^{-1}$ such that $z \equiv 1 \pmod{\mathfrak{n}}$. For each compact open set $U \subset O_p$, define the Shintani zeta function
$$\zeta(\mathfrak{b}, U, D, s) = \sum_i a_i \sum_{\substack{\alpha \in C_i \cap \mathfrak{b}^{-1} \\ \alpha \in U,\ (\alpha, S) = 1}} (\mathrm{N}\alpha)^{-s}.$$
Shintani proved that this sum converges for $\Re(s)$ large enough and extends to a meromorphic function on $\mathbf{C}$. Define
$$\mu_{\mathfrak{b}}(U) = \zeta(\mathfrak{b}, U, D, 0) - \ell \cdot \zeta(\mathfrak{b}\mathfrak{l}^{-1}, U, D, 0).$$
Using Shintani's formulas, one may show:
Theorem 7.3 ([15, Proposition 3.12]). For every compact open $U \subset O_p$, we have $\mu_{\mathfrak{b}}(U) \in \mathbf{Z}$.
We may now state our conjectural exact formula for the Brumer-Stark unit $u_p$ and all of its conjugates over $F$. Write
$$\Theta_{S,T} = \sum_{\sigma \in G} \zeta_{S,T}(\sigma) \sigma^{-1}, \quad \zeta_{S,T}(\sigma) \in \mathbf{Z}.$$
Define
$$u_p(\mathfrak{b})^{\mathrm{an}} = p^{\zeta_{S,T}(\sigma_{\mathfrak{b}})} \times\!\!\!\!\!\!\int_{O_p^*} x \, d\mu_{\mathfrak{b}}(x) \in F_p^*. \tag{7.2}$$
Here the crossed integral is a multiplicative integral in the sense of Darmon [12] and can be expressed as a limit of Riemann products:
$$\times\!\!\!\!\!\!\int_{O_p^*} x \, d\mu_{\mathfrak{b}}(x) := \lim_{m \rightarrow \infty} \prod_{a \in (O_p/p^m)^*} a^{\mu_{\mathfrak{b}}(a + p^m O_p)}.$$
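To illustrate how a limit of Riemann products behaves, here is a toy sketch. It does not compute the measure $\mu_{\mathfrak{b}}$ of the text, which would require Shintani zeta values. Instead it assumes a hypothetical $\mathbf{Z}$-valued measure on $\mathbf{Z}_p^*$ that is a finite integer combination of point masses, $\mu = \sum_i n_i \delta_{a_i}$ with each $a_i$ an integer prime to $p$, for which the multiplicative integral is $\prod_i a_i^{n_i}$. The function name `riemann_product` is likewise hypothetical.

```python
def riemann_product(point_masses, p, m):
    """Level-m Riemann product  prod_a a^{mu(a + p^m Z_p)}  modulo p^m.

    `point_masses` is a list of pairs (a_i, n_i) describing the toy measure
    mu = sum_i n_i * delta_{a_i}; each a_i must be prime to p.
    """
    mod = p ** m
    mass = {}
    for a, n in point_masses:
        if a % p == 0:
            raise ValueError("point masses must be supported on p-adic units")
        # mu(a + p^m Z_p) collects every point mass in that residue class.
        mass[a % mod] = mass.get(a % mod, 0) + n
    result = 1
    for residue, n in mass.items():
        # pow accepts negative exponents for invertible residues (Python 3.8+).
        result = result * pow(residue, n, mod) % mod
    return result


# Toy measure 2*delta_3 - delta_5 on Z_7^*; the integral should be 9/5 in Z_7.
mu = [(3, 2), (5, -1)]
for m in range(1, 5):
    print(m, riemann_product(mu, 7, m))
```

Successive products agree modulo lower powers of $p$, which is the sense in which the limit over $m$ exists $p$-adically in this toy case.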
Write $\sigma_{\mathfrak{b}} \in G$ for the Frobenius associated to $\mathfrak{b}$. In [15, Theorem 5.15] we prove that $u_p(\mathfrak{b})^{\mathrm{an}}$ depends only on the image of $\mathfrak{b}$ in the narrow ray class group of conductor $\mathfrak{n}$, i.e., on $\sigma_{\mathfrak{b}} \in G$ (at least up to a root of unity in $F_p^*$).
Conjecture 7.4. We have $\sigma_{\mathfrak{b}}(u_{\mathfrak{p}}) = u_p(\mathfrak{b})^{\mathrm{an}}$ in $F_p^*$.
The expression (7.2) can be computed to high $p$-adic precision on a computer. See [23] for tables of narrow Hilbert class fields of real quadratic fields determined using this formula.
It is convenient to have an invariant that also satisfies $u_p(\mathfrak{b}\mathfrak{q})^{\mathrm{an}} = (u_p(\mathfrak{b})^{\mathrm{an}})^{-1}$ if $\mathfrak{q}$ is a prime such that $\sigma_{\mathfrak{q}} = c$. Conjecture 7.4 would imply such a formula, but it is unclear whether this purely analytic statement can be proved unconditionally. To this end, we fix $\mathfrak{q}$ such that $\sigma_{\mathfrak{q}} = c$ and define
$$v_p(\mathfrak{b})^{\mathrm{an}} = \left(\frac{u_p(\mathfrak{b})^{\mathrm{an}}}{u_p(\mathfrak{b}\mathfrak{q})^{\mathrm{an}}}\right)^{1/2} \in \hat{F}_p^* := F_p^* \hat{\otimes} \mathbf{Z}_p.$$
One then has
$$v_p(\mathfrak{b}\mathfrak{q})^{\mathrm{an}} = (v_p(\mathfrak{b})^{\mathrm{an}})^{-1} \tag{7.3}$$
unconditionally, and we expect to have $v_p(\mathfrak{b})^{\mathrm{an}} = u_p(\mathfrak{b})^{\mathrm{an}}$. The following is therefore a slightly easier form of Conjecture 7.4 to study.
Conjecture 7.5. We have $\sigma_{\mathfrak{b}}(u_{\mathfrak{p}}) = v_p(\mathfrak{b})^{\mathrm{an}}$ in $\hat{F}_p^*$.
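One way to see why (7.3) requires no conjectural input is the following short computation, offered as a sketch. It assumes two facts: the result of [15, Theorem 5.15] quoted above, that $u_p(\mathfrak{b})^{\mathrm{an}}$ depends only on $\sigma_{\mathfrak{b}}$ up to a root of unity, and that roots of unity in $F_p^*$ become trivial in $\hat{F}_p^*$, as they have order prime to $p$ when $p$ is odd and unramified in $F$. Since $\sigma_{\mathfrak{q}}^2 = c^2 = 1$, the ideals $\mathfrak{b}\mathfrak{q}^2$ and $\mathfrak{b}$ have the same image in $G$, and therefore
$$v_p(\mathfrak{b}\mathfrak{q})^{\mathrm{an}} = \left(\frac{u_p(\mathfrak{b}\mathfrak{q})^{\mathrm{an}}}{u_p(\mathfrak{b}\mathfrak{q}^2)^{\mathrm{an}}}\right)^{1/2} = \left(\frac{u_p(\mathfrak{b}\mathfrak{q})^{\mathrm{an}}}{u_p(\mathfrak{b})^{\mathrm{an}}}\right)^{1/2} = \left(v_p(\mathfrak{b})^{\mathrm{an}}\right)^{-1} \quad \text{in } \hat{F}_p^*.$$
The square root is well defined because $2$ is invertible in $\mathbf{Z}_p$ for odd $p$.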

7.3. Horizontal Iwasawa theory

We now discuss the relationship between Gross's tower of fields conjecture (Conjecture 4.4) and our conjectural exact formula for Brumer-Stark units. Our goal is to prove:
Theorem 7.6. Assume that $p$ is odd. Gross's conjecture implies Conjecture 7.5.
In this exposition we have assumed that the odd prime $p$ is inert in $F$ and that $\mathfrak{p} = pO_F$. In the case of general $\mathfrak{p}$, one must still assume that $p$ is odd and unramified in $F$ in the statement of Theorem 7.6.
The abelian extensions $L/F$ to which we can apply Gross's conjecture (with $S' = S \cup \{\mathfrak{p}\}$ as in Conjecture 4.4) are those that contain $H$ and are unramified outside $S'\infty$. Let $F_{S'}$ denote the maximal abelian extension of $F$ unramified outside $S'\infty$. The reciprocity map of class field theory yields an explicit description of $\operatorname{Gal}(F_{S'}/H)$. For each finite $v \in S'$, let $U_{v,\mathfrak{n}} \subset O_v^*$ denote the subgroup of elements congruent to 1 modulo $\mathfrak{n}O_v$ (so $U_{v,\mathfrak{n}} = O_v^*$ for $v \nmid \mathfrak{n}$). Define $\mathbf{O}^* = \prod_{v \in S' \setminus S_\infty} U_{v,\mathfrak{n}}$. Then
$$\operatorname{Gal}(F_{S'}/H) \cong \mathbf{O}^* / \overline{E(\mathfrak{n})},$$
where $\overline{E(\mathfrak{n})}$ denotes the topological closure of $E(\mathfrak{n})$ embedded diagonally in $\mathbf{O}^*$.
For each finite extension $L \subset F_{S'}$ containing $H$, if we write $\Gamma = \operatorname{Gal}(L/H)$, then (4.7) yields a formula for $\operatorname{rec}_G(u_p)$ in $I/I^2 \cong \mathbf{Z}[G] \otimes \Gamma$. Under this isomorphism, the coefficient of $\sigma_{\mathfrak{b}}^{-1}$ is just the image of $\operatorname{rec}_{\mathfrak{p}}(\sigma_{\mathfrak{b}}(u_p))$ in $\Gamma$. Taking the inverse limit over all $H \subset L \subset F_{S'}$ therefore gives an equality for
$$(\sigma_{\mathfrak{b}}(u_p), 1, 1, \ldots, 1) \quad \text{in } \mathbf{O}^* / \overline{E(\mathfrak{n})}.$$
Here we have written $O_p^*$ as the first component of $\mathbf{O}^*$.
The next key point is that the constructions of Section 7.2 can be repeated to provide a measure $\mu_{\mathfrak{b},\mathbf{O}}$ on $\mathbf{O} = \prod_{v \in S' \setminus S_\infty} O_v$ extending the measure $\mu_{\mathfrak{b}}$ on $O_p$. It is not hard to check that the restriction of $\mu_{\mathfrak{b},\mathbf{O}}$ to $\mathbf{O}^*$, pushed forward to $\mathbf{O}^* / \overline{E(\mathfrak{n})}$, is precisely the measure that recovers the values of the partial zeta functions of the abelian extensions $L$ contained in $F_{S'}$. These are exactly the values appearing in Gross's conjecture. In other words, Gross's conjecture for the set $S'$ is equivalent to
$$(\sigma_{\mathfrak{b}}(u_{\mathfrak{p}}), 1, 1, \ldots, 1) \cdot p^{-\zeta_{S,T}(\sigma_{\mathfrak{b}})} = \times\!\!\!\!\!\!\int_{\mathbf{O}^*} x \, d\mu_{\mathfrak{b},\mathbf{O}}(x) \quad \text{in } \mathbf{O}^* / \overline{E(\mathfrak{n})}. \tag{7.4}$$
See [15, Proposition 3.4]. The next important calculation ([15, Theorem 3.22]) is that
$$p^{\zeta_{S,T}(\sigma_{\mathfrak{b}})} \times\!\!\!\!\!\!\int_{\mathbf{O}^*} x \, d\mu_{\mathfrak{b},\mathbf{O}}(x) = (u_p(\mathfrak{b})^{\mathrm{an}}, 1, 1, \ldots, 1). \tag{7.5}$$
The first component of this is simply the compatibility of the constructions of $\mu_{\mathfrak{b}}$ and $\mu_{\mathfrak{b},\mathbf{O}}$; the interesting part of the computation is the 1's in the components away from $p$. Equations (7.4) and (7.5) combine to yield that the ratio $\sigma_{\mathfrak{b}}(u_p) / u_p(\mathfrak{b})^{\mathrm{an}}$ lies in the group
$$D(S) = \left\{x \in O_p^* : (x, 1, 1, \ldots, 1) \in \overline{E(\mathfrak{n})} \subset \mathbf{O}^*\right\}.$$
We can also conclude
$$\sigma_{\mathfrak{b}}(u_p) / v_p(\mathfrak{b})^{\mathrm{an}} \in D(S) \tag{7.6}$$
since $c(u_p) = u_p^{-1}$.
The final trick, inspired by the method of Taylor-Wiles, is to consider certain enlarged sets $S_Q = S \cup Q$ for a well-chosen finite set of auxiliary primes $Q$. Let us compare the Brumer-Stark units for $S$ and $S_Q$, denoted $u_p$ and $u_p(S_Q)$, respectively. The defining property (4.1) shows that
$$u_p(S_Q) = u_p^z, \quad \text{where } z = \prod_{\mathfrak{q} \in Q} (1 - \sigma_{\mathfrak{q}}^{-1}) \in \mathbf{Z}[G].$$
In particular, if we choose the $\mathfrak{q} \in Q$ such that $\sigma_{\mathfrak{q}} = c$, the complex conjugation of $G$, then $u_p(S_Q) = u_p^{2^{\#Q}}$. Using (7.3), one can similarly show that $v_p(S_Q, \mathfrak{b})^{\mathrm{an}} = (v_p(\mathfrak{b})^{\mathrm{an}})^{2^{\#Q}}$. Now (7.6) for $S_Q$ implies that
$$\sigma_{\mathfrak{b}}(u_p(S_Q)) / v_p(S_Q, \mathfrak{b})^{\mathrm{an}} \in D(S_Q),$$
hence
$$\left(\sigma_{\mathfrak{b}}(u_p) / v_p(\mathfrak{b})^{\mathrm{an}}\right)^{2^{\#Q}} \in D(S_Q), \quad \text{so } \sigma_{\mathfrak{b}}(u_p) / v_p(\mathfrak{b})^{\mathrm{an}} \in D(S_Q)$$
since $p \neq 2$.
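The exponent $2^{\#Q}$ can be traced through a short computation, sketched here under the assumption made in the text that every $\mathfrak{q} \in Q$ satisfies $\sigma_{\mathfrak{q}} = c$. Since $c^{-1} = c$ and $c(u_p) = u_p^{-1}$, each factor of $z$ squares the unit:
$$u_p^{1 - \sigma_{\mathfrak{q}}^{-1}} = \frac{u_p}{c(u_p)} = u_p^2, \qquad \text{so} \qquad u_p^z = u_p^{(1-c)^{\#Q}} = u_p^{2^{\#Q}}.$$
For the final step, our reading of "since $p \neq 2$" is that the quotient is considered in the $p$-adically completed group $\hat{F}_p^*$, which is a $\mathbf{Z}_p$-module. There $2^{\#Q}$ is a unit in $\mathbf{Z}_p$, so raising to the power $2^{\#Q}$ is invertible and the exponent can be removed.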
To conclude the proof of Theorem 7.6, one shows using the Čebotarev Density Theorem that one can choose the sets $Q$ to force $D(S_Q)$ as small as desired (i.e., the intersection of $D(S_Q)$ over all possible $Q$ is trivial). See [15, Lemma 5.17] for details.

7.4. The Greenberg-Stevens $\mathscr{L}$-invariant

We briefly summarize our proof of the $p$-part of Gross's conjecture (Theorem 4.5), which as just explained implies our explicit formula for Brumer-Stark units given in Conjecture 7.5.
The work of Greenberg and Stevens [24] was a seminal breakthrough in the study of trivial zeroes of $p$-adic $L$-functions. Their perspective was highly influential in [16], where the rank one $p$-adic Gross-Stark conjecture was interpreted as the equality of an algebraic $\mathscr{L}$-invariant $\mathscr{L}_{\text{alg}}$ and an analytic $\mathscr{L}$-invariant $\mathscr{L}_{\text{an}}$. The analytic $\mathscr{L}$-invariant is the ratio of the leading term of the $p$-adic $L$-function at $s = 0$ to its classical counterpart,
$$\mathscr{L}_{\mathrm{an}} = -\frac{L_p'(\chi\omega, 0)}{L(\chi, 0)}. \tag{7.7}$$
The algebraic $\mathscr{L}$-invariant is the ratio of the $p$-adic logarithm and valuation of the $\chi^{-1}$ component of the Brumer-Stark unit,
$$\mathscr{L}_{\text{alg}} = \frac{\log_p \operatorname{Norm}_{H_{\mathfrak{P}}/\mathbf{Q}_p}\left(u_{\mathfrak{p}}^{\chi^{-1}}\right)}{\operatorname{ord}_{\mathfrak{P}}\left(u_{\mathfrak{p}}^{\chi^{-1}}\right)}. \tag{7.8}$$
There is no difficulty in defining the ratios (7.7) and (7.8), since the quantities live in a $p$-adic field and the denominators are nonzero. The analogue of this situation for Gross's Conjecture 4.4 is more delicate. The role of the $p$-adic $L$-function is played by the Stickelberger element $\Theta_L := \Theta_{S',T}(L/F, 0) \in \mathbf{Z}[g]$, and the analogue of the derivative at 0 is played by the image of $\Theta_L$ in $I/I^2$. The role of the classical $L$-function is played by the element $\Theta_H := \Theta_{S,T}(H/F, 0) \in \mathbf{Z}[G]$. It is therefore not clear how to take the "ratio" of these quantities. Similarly, the role of the $p$-adic logarithm is played by $\operatorname{rec}_G(u_{\mathfrak{p}}) \in I/I^2$ and the role of the $p$-adic valuation is played by $\operatorname{ord}_G(u_{\mathfrak{p}}) \in \mathbf{Z}[G]$.
For this reason, we introduce in [18] an $R$-algebra $R_{\mathscr{L}}$ that is generated by an element $\mathscr{L}$ that plays the role of the analytic $\mathscr{L}$-invariant, i.e., the "ratio" between $\Theta_L$ and $\Theta_H$. We define
$$R_{\mathscr{L}} = R[\mathscr{L}] / \left(\Theta_H \mathscr{L} - \Theta_L, \ \mathscr{L} I, \ \mathscr{L}^2, \ I^2\right). \tag{7.9}$$
A key nontrivial result is that this ring, in which we have adjoined a ratio $\mathscr{L}$ between $\Theta_L$ and $\Theta_H$, is still large enough to see $R/I^2$.
Theorem 7.7 ([18, Theorem 3.4]). The kernel of the structure map $R \rightarrow R_{\mathscr{L}}$ is $I^2$.
It follows from this theorem that Gross's Conjecture is equivalent to the equality
$$\operatorname{rec}_G(u_{\mathfrak{p}}) = \mathscr{L} \operatorname{ord}_G(u_{\mathfrak{p}}) \quad \text{in } R_{\mathscr{L}} \tag{7.10}$$
since the right-hand side is by definition $\mathscr{L}\Theta_H = \Theta_L$.
To prove (7.10), we define a generalized Ritter-Weiss module $\nabla_{\mathscr{L}}$ over the ring $R_{\mathscr{L}}$ that can be viewed as a gluing of the modules $\nabla_S^T(H)$ and $\nabla_{S'}^T(L)$. We show in [18, Theorem 4.6] that the Fitting ideal $\operatorname{Fitt}_{R_{\mathscr{L}}}(\nabla_{\mathscr{L}})$ is generated by the element
$$\operatorname{rec}_G(u_{\mathfrak{p}}) - \mathscr{L} \operatorname{ord}_G(u_{\mathfrak{p}}) \in I/I^2$$
and hence that (7.10) is equivalent to
$$\operatorname{Fitt}_{R_{\mathscr{L}}}(\nabla_{\mathscr{L}}) = 0. \tag{7.11}$$
(For the sake of accuracy, we remark that in reality we do all of this with $(S, T)$ replaced by the pair $(\Sigma, \Sigma')$ defined in Section 5.3, as in Section 6.)
The vanishing of $\operatorname{Fitt}_{R_{\mathscr{L}}}(\nabla_{\mathscr{L}})$ is proven following the methods of Section 6. We interpret surjective homomorphisms from $\nabla_{\mathscr{L}}$ to $R_{\mathscr{L}}$-modules $M$ in terms of Galois cohomology classes satisfying certain local conditions. We construct a suitable Galois cohomology class valued in a module $M$ using an explicit construction with group-ring valued Hilbert modular forms and their associated Galois representations. The module $M$ is shown to be large enough that its Fitting ideal over $R_{\mathscr{L}}$ vanishes, whence the same is true for $\nabla_{\mathscr{L}}$ since it has $M$ as a quotient.

7.5. The method of Darmon-Pozzi-Vonk

We conclude by describing a proof of Conjecture 7.4 in the case that $F$ is a real quadratic field in the beautiful work of Darmon, Pozzi, and Vonk [14]. Their method is purely $p$-adic (i.e., "vertical"), rather than involving the introduction of auxiliary primes (i.e., "horizontal"). The strategy follows a rich history of arithmetic formulas proven by exhibiting both sides of an equation as certain Fourier coefficients in an equality of modular forms. For instance, Katz gave an elegant proof of Leopoldt's evaluation of the Kubota-Leopoldt $p$-adic $L$-function at $s = 1$ by exhibiting an equality of $p$-adic modular forms, one of whose constant terms is the $p$-adic $L$-value and the other is the $p$-adic logarithm of a unit (see [33, §10.2]). The proof of Darmon-Pozzi-Vonk follows a similar strategy.
Let $F$ be a real quadratic field, $p$ an odd prime, and $H$ a narrow ring class field extension of $F$ (so, in particular, $pO_F$ splits completely in $H$). Darmon-Pozzi-Vonk demonstrate an equality of certain classical modular forms of weight 2 on $\Gamma_0(p) \subset \mathrm{SL}_2(\mathbf{Z})$ that we denote $f_1$ and $f_2$.
The first of these forms $f_1$ is obtained by considering a Hida family of Hilbert modular cusp forms for $F$ specializing in weight 1 to a $p$-stabilized Eisenstein series. The constant term of this weight 1 Eisenstein series vanishes because of the trivial zero of the corresponding $p$-adic $L$-function. Pozzi has described explicitly the Fourier coefficients of the derivative of this family with respect to the weight variables [39]. The key idea of Darmon-Pozzi-Vonk is to restrict the derivative in the antiparallel direction along the diagonal and take the ordinary projection to obtain a classical modular form of weight 2 for $\Gamma_0(p)$. The idea of taking the derivative of a family of modular forms at a point of vanishing and applying a "holomorphic projection" operator has its roots in the seminal work of Gross-Zagier [30], and appears more recently in Kudla's program for incoherent Eisenstein series [34].
Pozzi's work relates the $p$th Fourier coefficient of this diagonal restriction to the $p$-adic logarithm of the Brumer-Stark unit $\sigma_{\mathfrak{b}}(u_p)$ for the extension $H$. To obtain the desired weight 2 form $f_1$ on $\Gamma_0(p)$, one must take a certain linear combination with the diagonal restrictions of the two ordinary families of Eisenstein series passing through this weight 1 point.
The second form $f_2$ is defined as a generating series attached to a certain rigid analytic theta cocycle. These are classes in $H^1(\mathrm{SL}_2(\mathbf{Z}[1/p]), \mathscr{A}^*/\mathbf{C}_p^*)$, where $\mathscr{A}^*$ denotes the group of rigid analytic nonvanishing functions on the $p$-adic upper half plane. Darmon-Pozzi-Vonk construct classes in this space explicitly, and study their image under the logarithmic annular residue map
$$H^1(\mathrm{SL}_2(\mathbf{Z}[1/p]), \mathscr{A}^*/\mathbf{C}_p^*) \rightarrow H^1(\Gamma_0(p), \mathbf{Z}_p).$$
They compute the spectral expansion of the form $f_2$ and thereby show that its nonconstant Fourier coefficients are equal to those of $f_1$. Meanwhile, the constant coefficient is equal to the $p$-adic logarithm of $u_p(\mathfrak{b})^{\mathrm{an}}$. The equality of the nonconstant coefficients implies that $f_1 = f_2$, and hence that the constant coefficients are equal as well, i.e.,
$$\log_p(\sigma_{\mathfrak{b}}(u_p)) = \log_p(u_p(\mathfrak{b})^{\mathrm{an}})$$
as desired. It is a tantalizing problem to generalize this strategy to arbitrary totally real fields.

ACKNOWLEDGMENTS

We would like to thank the many mathematicians whose work has been highly influential in the development of the perspective that we have described here. Our work is the continuation of a long line of research connecting $L$-functions, modular forms, Galois representations, and Fitting ideals of class groups. In particular, we would like to thank Armand Brumer, David Burns, John Coates, Pierre Charollois, Pierre Colmez, Henri Darmon, Cornelius Greither, Benedict Gross, Masato Kurihara, Cristian Popescu, Alice Pozzi, Kenneth Ribet, Jurgen Ritter, Karl Rubin, Takamichi Sano, Michael Spiess, John Tate, Jan Vonk, Alfred Weiss, and Andrew Wiles.

FUNDING

The first author is supported by a grant from the National Science Foundation, DMS1901939. The second author is supported by DST-SERB grant SB/SJF/2020-21/11, SERB SUPRA grant SPR/2019/000422 and SERB MATRICS grant MTR/2020/000215.

REFERENCES

[1] M. Atsuta and T. Kataoka, Fitting ideals of class groups for CM abelian extensions. 2021, arXiv:2104.14765.
[2] A. Beilinson, G. Kings, and A. Levin, Topological polylogarithms and $p$-adic interpolation of $L$-values of totally real fields. Math. Ann. 371 (2018), no. 3-4, 1449-1495.
[3] N. Bergeron, P. Charollois, and L. Garcia, Transgressions of the Euler class and Eisenstein cohomology of $\mathrm{GL}_{N}(\mathbf{Z})$. Jpn. J. Math. 15 (2020), no. 2, 311-379.
[4] D. Burns, On derivatives of Artin $L$-series. Invent. Math. 186 (2011), no. 2, 291-371.
[5] D. Burns, On derivatives of $p$-adic $L$-series at $s=0$. J. Reine Angew. Math. 762 (2020), 53-104.
[6] D. Burns, M. Kurihara, and T. Sano, On zeta elements for $\mathbb{G}_{m}$. Doc. Math. 21 (2016), 555-626.
[7] D. Burns and T. Sano, On the theory of higher rank Euler, Kolyvagin and Stark systems. Int. Math. Res. Not. 2021 (2021), no. 13, 10118-10206.
[8] P. Cassou-Noguès, Valeurs aux entiers négatifs des fonctions zêta et fonctions zêta p-adiques. Invent. Math. 51 (1979), no. 1, 29-59.
[9] P. Charollois and S. Dasgupta, Integral Eisenstein cocycles on $\mathrm{GL}_{n}$, I: Sczech's cocycle and $p$-adic $L$-functions of totally real fields. Cambridge J. Math. 2 (2014), no. 1, 49-90.
[10] P. Charollois, S. Dasgupta, and M. Greenberg, Integral Eisenstein cocycles on $\mathrm{GL}_{n}$, II: Shintani's method. Comment. Math. Helv. 90 (2015), no. 2, 435-477.
[11] P. Colmez, Résidu en $s=1$ des fonctions zêta $p$-adiques. Invent. Math. 91 (1988), no. 2, 371-389.
[12] H. Darmon, Integration on $\mathscr{H}_{p} \times \mathscr{H}$ and arithmetic applications. Ann. of Math. 154 (2001), no. 3, 589-639.
[13] H. Darmon and S. Dasgupta, Elliptic units for real quadratic fields. Ann. of Math. 163 (2006), no. 1, 301-346.
[14] H. Darmon, A. Pozzi, and J. Vonk, The values of the Dedekind-Rademacher cocycle at real multiplication points. 2021, arXiv:2103.02490.
[15] S. Dasgupta, Shintani zeta functions and Gross-Stark units for totally real fields. Duke Math. J. 143 (2008), no. 2, 225-279.
[16] S. Dasgupta, H. Darmon, and R. Pollack, Hilbert modular forms and the Gross-Stark conjecture. Ann. of Math. 174 (2011), no. 1, 439-484.
[17] S. Dasgupta and M. Kakde, On the Brumer-Stark conjecture. 2020, arXiv:2010.00657.
[18] S. Dasgupta and M. Kakde, Brumer-Stark units and Hilbert's 12th Problem. 2021, arXiv:2103.02516.
[19] S. Dasgupta and M. Kakde, On constant terms of Eisenstein series. Acta Arith. 200 (2021), no. 2, 119-147.
[20] S. Dasgupta and M. Spiess, Partial zeta values, Gross's tower of fields conjecture, and Gross-Stark units. J. Eur. Math. Soc. (JEMS) 20 (2018), no. 11, 2643-2683.
[21] P. Deligne and K. Ribet, Values of abelian $L$-functions at negative integers over totally real fields. Invent. Math. 59 (1980), no. 3, 227-286.
[22] F. Diaz y Diaz and E. Friedman, Signed fundamental domains for totally real number fields. Proc. Lond. Math. Soc. 108 (2014), no. 4, 965-988.
[23] M. Fleischer and Y. Liu, Computations of elliptic units. 2021, https://github.com/~liuyj8526/Computation-of-Elliptic-Units.
[24] R. Greenberg and G. Stevens, $p$-adic $L$-functions and $p$-adic periods of modular forms. Invent. Math. 111 (1993), no. 2, 407-447.
[25] C. Greither, Determining Fitting ideals of minus class groups via the equivariant Tamagawa number conjecture. Compos. Math. 143 (2007), no. 6, 1399-1426.
[26] C. Greither and M. Kurihara, Stickelberger elements, Fitting ideals of class groups of CM-fields, and dualisation. Math. Z. 260 (2008), no. 4, 905-930.
[27] C. Greither and C. Popescu, An equivariant main conjecture in Iwasawa theory and applications. J. Algebraic Geom. 24 (2015), no. 4, 629-692.
[28] B. Gross, $p$-adic $L$-series at $s=0$. J. Fac. Sci., Univ. Tokyo, Sect. 1A, Math. 28 (1981), no. 3, 979-994.
[29] B. Gross, On the values of abelian $L$-functions at $s=0$. J. Fac. Sci., Univ. Tokyo, Sect. 1A, Math. 35 (1988), no. 1, 177-197.
[30] B. Gross and D. Zagier, Heegner points and derivatives of $L$-series. Invent. Math. 84 (1986), no. 2, 225-320.
[31] U. Jannsen, Iwasawa modules up to isomorphism. In Algebraic number theory, edited by J. Coates, R. Greenberg, B. Mazur, and I. Satake, pp. 171-207, Adv. Stud. Pure Math. 17, Academic Press, Boston, MA, 1989.
[32] T. Kataoka, Fitting invariants in equivariant Iwasawa theory. In Development of Iwasawa theory - the centennial of K. Iwasawa's birth, edited by M. Kurihara, K. Bannai, T. Ochiai, and T. Tsuji, pp. 413-465, Adv. Stud. Pure Math. 86, Mathematical Society of Japan, Tokyo, 2020.
[33] N. Katz, p-adic interpolation of real analytic Eisenstein series. Ann. of Math. 104 (1976), no. 3, 459-571.
[34] S. Kudla, Central derivatives of Eisenstein series and height pairings. Ann. of Math. 146 (1997), no. 3, 545-646.
[35] M. Kurihara, Notes on the dual of the ideal class groups of CM-fields. J. Théor. Nombres Bordeaux (to appear).
[36] B. Mazur, How can we construct abelian Galois extensions of basic number fields? Bull. Amer. Math. Soc. (N.S.) 48 (2011), no. 2, 155-209.
[37] D. Northcott, Finite free resolutions. Cambridge Univ. Press, Cambridge-New York, 1976.
[38] C. Popescu, Stark's question and a refinement of Brumer's conjecture extrapolated to the function field case. Compos. Math. 140 (2004), no. 3, 631-646.
[39] A. Pozzi, The eigencurve at weight one Eisenstein points. Ph.D. thesis, McGill University, Montreal, 2018.
[40] J. Ritter and A. Weiss, A Tate sequence for global units. Compos. Math. 102 (1996), no. 2, 147-178.
[41] K. Rubin, A Stark conjecture "over $\mathbf{Z}$" for abelian $L$-functions with multiple zeros. Ann. Inst. Fourier (Grenoble) 46 (1996), no. 1, 33-62.
[42] J-P. Serre, Corps locaux, Publications de l'Institut de Mathématique de l'Université de Nancago, VIII. Actual. Sci. Ind. 1296, Hermann, Paris, 1962.
[43] J. Silliman, Group ring valued Hilbert modular forms. 2020, arXiv:2009.14353.
[44] W. Sinnott, On the Stickelberger ideal and the circular units of an abelian field. Invent. Math. 62 (1980), no. 2, 181-234.
[45] M. Spiess, Shintani cocycles and the order of vanishing of $p$-adic Hecke $L$-series at $s=0$. Math. Ann. 359 (2014), no. 1-2, 239-265.
[46] J. Tate, On Stark's conjectures on the behavior of $L(s, \chi)$ at $s=0$. J. Fac. Sci., Univ. Tokyo, Sect. 1A, Math. 28 (1981), no. 3, 963-978.

SAMIT DASGUPTA

Duke University, Department of Mathematics, Campus Box 90320, Durham, NC 27708-0320, USA, dasgupta@math.duke.edu

MAHESH KAKDE

Department of Mathematics, Indian Institute of Science, Bangalore 560012, India, maheshkakde@iisc.ac.in

ARITHMETIC AND DYNAMICS ON VARIETIES OF MARKOFF TYPE

ALEXANDER GAMBURD

ABSTRACT

The Markoff equation $x^{2}+y^{2}+z^{2}=3xyz$, which arose in his spectacular thesis in 1879, is ubiquitous in a tremendous variety of contexts. After reviewing some of these, we discuss the Hasse principle, asymptotics of integer points, and, in particular, recent progress towards establishing forms of strong approximation on varieties of Markoff type, as well as ensuing implications, diophantine and dynamical.

MATHEMATICS SUBJECT CLASSIFICATION 2020

Primary 14G12; Secondary 11N36, 11D45, 37P55

KEYWORDS

Markoff triples, strong approximation, nonlinear dynamics
Important though the general concepts and propositions may be with which the modern industrious passion for axiomatizing and generalizing has presented us, in algebra perhaps more than anywhere else, nevertheless I am convinced that the special problems in all their complexity constitute the stock and core of mathematics; and to master their difficulties requires on the whole the harder labor.
Hermann Weyl, The Classical Groups, 1939

1. INTRODUCTION

1.1. Andrei Andreevich Markov is one of the towering peaks of the illustrious Saint Petersburg school of number theory, alongside Chebyshev and Linnik. A singular characteristic of this school is a deep, often subterranean, interaction between arithmetic/combinatorics and probability/dynamics. While Markov is perhaps most widely known today for the chains named after him, it is in the context of his arguably deepest work on the minima of binary quadratic forms and badly approximable numbers${}^{1}$ that the following equation, now bearing his name, was born:
\begin{equation*} x_{1}^{2}+x_{2}^{2}+x_{3}^{2}=3 x_{1} x_{2} x_{3} \tag{1.1} \end{equation*}
describing a Markoff surface $X \subset \mathbb{A}^{3}$. Markoff triples $\mathcal{M}$ are the solutions of (1.1) with positive integral coordinates. Markoff numbers $\mathbb{M} \subset \mathbb{N}$ are obtained as coordinates of elements of $\mathcal{M}$. The Markoff sequence $\mathbb{M}^{s}$ is the set of largest coordinates of an $m \in \mathcal{M}$ counted with multiplicity; the uniqueness conjecture of Frobenius [62] asserts that $\mathbb{M}=\mathbb{M}^{s}$.
All elements of $\mathcal{M}$ are gotten from the root solution $r=(1,1,1)$ by repeated applications of an element in a set $S$, consisting of $\sigma \in \Sigma_{3}$, the permutations of the coordinates of $(x_{1}, x_{2}, x_{3})$, and of the Vieta involutions $R_{1}, R_{2}, R_{3}$ of $\mathbb{A}^{3}$, with $R_{1}(x_{1}, x_{2}, x_{3})=(3 x_{2} x_{3}-x_{1}, x_{2}, x_{3})$ and $R_{2}, R_{3}$ defined similarly. Denoting by $\Gamma$ the nonlinear group of affine morphisms of $\mathbb{A}^{3}$ generated by $S$, the set of Markoff triples $\mathcal{M}$ can be identified with the orbit of the root $r$ under the action of $\Gamma$, that is to say, $\mathcal{M}=\Gamma \cdot r$, giving rise to the Markoff tree [8]:
$$
(1,1,1)-(1,1,2)-(2,1,5)\left\langle
\begin{array}{l}
(5,1,13)\left\langle
\begin{array}{l}
(13,1,34)\left\langle\begin{array}{l}(34,1,89) \cdots \\ (13,34,1325) \cdots\end{array}\right. \\
(5,13,194)\left\langle\begin{array}{l}(194,13,7561) \cdots \\ (5,194,2897) \cdots\end{array}\right.
\end{array}\right. \\
(2,5,29)\left\langle
\begin{array}{l}
(29,5,433)\left\langle\begin{array}{l}(433,5,6466) \cdots \\ (29,433,37666) \cdots\end{array}\right. \\
(2,29,169)\left\langle\begin{array}{l}(169,29,14701) \cdots \\ (2,169,985) \cdots\end{array}\right.
\end{array}\right.
\end{array}\right.
$$
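As a hedged illustration of how the tree arises from the generators, the following Python sketch builds its levels by applying a Vieta involution and then a transposition of coordinates. The function names (`vieta`, `transpose`, `children`, `tree_levels`) and the rule used to order the coordinates of each descendant are choices made here for illustration; they are not notation from the text.

```python
def vieta(triple, j):
    """Vieta involution R_{j+1}: replace coordinate j by 3 * (product of the other two) - x_j."""
    x = list(triple)
    others = [x[i] for i in range(3) if i != j]
    x[j] = 3 * others[0] * others[1] - x[j]
    return tuple(x)


def transpose(triple, i, j):
    """Permutation of coordinates exchanging positions i and j."""
    x = list(triple)
    x[i], x[j] = x[j], x[i]
    return tuple(x)


def is_markoff(triple):
    """Check equation (1.1) for an integer triple."""
    x1, x2, x3 = triple
    return x1 * x1 + x2 * x2 + x3 * x3 == 3 * x1 * x2 * x3


def children(triple):
    """Two descendants of (a, b, c) with c largest: (c, b, 3bc - a) and (a, c, 3ac - b).

    The ordering of coordinates is an assumption chosen to match the displayed leaves.
    """
    return [transpose(vieta(triple, 0), 0, 2), transpose(vieta(triple, 1), 1, 2)]


def tree_levels(depth, start=(2, 1, 5)):
    """Levels of the binary tree below `start`, obtained by iterating `children`."""
    levels = [[start]]
    for _ in range(depth):
        levels.append([child for node in levels[-1] for child in children(node)])
    return levels


if __name__ == "__main__":
    for level in tree_levels(3):
        print(level)
```

Each node is a solution of (1.1) because the Vieta involutions and the permutations preserve the equation; the third level below $(2,1,5)$ consists of the eight triples ending in 89, 1325, 7561, 2897, 6466, 37666, 14701, and 985.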
1 This work of Markoff and some of the subsequent appearances of his equation in a tremendous variety of different contexts are briefly discussed in Section 2.
The first few members of $\mathbb{M}$ are
$$1, 2, 5, 13, 29, 34, 89, 169, 194, 233, 433, 610, 985, \ldots$$
The sequence $\mathbb{M}^{s}$ is sparse, as shown by Zagier [147]:
\begin{equation*} \sum_{\substack{m \in \mathbb{M}^{s} \\ m \leq T}} 1 \sim c(\log T)^{2} \quad \text{as } T \rightarrow \infty \quad (c>0). \tag{1.2} \end{equation*}
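The sparsity expressed by (1.2) can be explored numerically. The sketch below enumerates Markoff triples in sorted form $a \leq b \leq c$ by descending the tree and collects their largest coordinates. The helper name `markoff_maxima` is hypothetical, and since the value of the constant $c$ is not given here, the script only prints the ratio of the count to $(\log T)^{2}$ without comparing it to any particular value.

```python
import math


def markoff_maxima(bound):
    """Largest coordinates, with multiplicity, of Markoff triples whose maximum is <= bound."""
    maxima = [m for m in (1, 2) if m <= bound]  # from the triples (1,1,1) and (1,1,2)
    stack = [(1, 2, 5)]
    while stack:
        a, b, c = stack.pop()
        if c > bound:
            continue  # the maximum only grows further down the tree
        maxima.append(c)
        # The two descendants, again written in increasing order.
        stack.append((a, c, 3 * a * c - b))
        stack.append((b, c, 3 * b * c - a))
    return sorted(maxima)


if __name__ == "__main__":
    print(markoff_maxima(1000))
    for exponent in (3, 6, 12, 24):
        T = 10 ** exponent
        count = len(markoff_maxima(T))
        print(exponent, count, count / math.log(T) ** 2)
```

With `bound = 1000` the function returns the thirteen numbers listed above; pruning at `c > bound` is safe because each descendant has a strictly larger maximum than its parent.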
1.2. The origins of the investigations which underlie "the stock and core" of this report date back to August of 2005 and involve a "special problem" pertaining to Markoff numbers; here is Peter Sarnak's recollection [126]: "For me the starting point of this investigation was in 2005 when Michel and Venkatesh asked me about the existence of poorly distributed closed geodesics on the modular surface. It was clear that Markov's constructions of his geodesics using his Markov equation provided what they wanted but they preferred quadratic forms with square free discriminant. This raised the question of sieving in this context of an orbit of a group of (nonlinear) morphisms of affine space. The kind of issues that one quickly faces in attempting to execute such a sieve are questions of the image of the orbit when reduced $\bmod\ q$ and interestingly whether certain graphs associated with these orbits are expander families.${}^{2}$ Gamburd in his thesis had established the expander property in some simpler but similar settings and he and I began a lengthy investigation into this sieving problem in the simpler setting when the group of affine morphisms acts linearly (or what we call now the affine linear sieve)."
The question posed by Michel and Venkatesh arose in the course of their joint work with Einsiedler and Lindenstrauss [58,59] on generalizations of Duke's theorem [57]; formulated in terms of Markoff numbers, it leads to the following:
Conjecture 1. There are infinitely many square-free Markoff numbers.
As detailed in [21], an application of sieve methods in the setting of affine orbits leads to and demands an affirmative answer to the question as to whether Markoff graphs, obtained as a modular reduction of the Markoff tree,${}^{3}$ form a family of expanders. Numerical experiments by de Courcy-Ireland and Lee [55], as well as results detailed in Section 2.5, are compelling in favor of the following superstrong approximation conjecture for Markoff graphs:
Conjecture 2. The family of Markoff graphs $X^{*}(p)$ forms a family of expanders.
FIGURE 1

Markoff graph mod 7. In [54] it is proved that the Markoff graphs are not planar for primes greater than 7.

Before attacking this conjecture, asserting high connectivity of Markoff graphs, one has to confront the question of their connectivity, that is to say, the issue of the strong approximation for Markoff graphs:
Conjecture 3. The map $\pi_{p}: \mathcal{M} \rightarrow X^{*}(p)$ is onto, that is to say, Markoff graphs $X^{*}(p)$ are connected.
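For small primes the connectivity asserted in Conjecture 3 can be checked by brute force. The sketch below takes $X^{*}(p)$ to be the set of nonzero solutions of (1.1) over $\mathbb{Z}/p\mathbb{Z}$ and computes the orbits of the group generated by the Vieta involutions and the coordinate permutations; this reading of $X^{*}(p)$ and the function names are assumptions made for illustration. The enumeration costs about $p^{3}$ steps, so it is only meant for very small $p$.

```python
from itertools import product


def markoff_points(p):
    """Nonzero solutions of x^2 + y^2 + z^2 = 3xyz over Z/pZ (assumed meaning of X*(p))."""
    return {
        (x, y, z)
        for x, y, z in product(range(p), repeat=3)
        if (x * x + y * y + z * z - 3 * x * y * z) % p == 0 and (x, y, z) != (0, 0, 0)
    }


def neighbours(v, p):
    """Images of v under the three Vieta involutions and two transpositions."""
    x, y, z = v
    yield ((3 * y * z - x) % p, y, z)  # R_1
    yield (x, (3 * x * z - y) % p, z)  # R_2
    yield (x, y, (3 * x * y - z) % p)  # R_3
    yield (y, x, z)                    # two transpositions generate all of Sigma_3
    yield (x, z, y)


def orbits(p):
    """Orbits of the generated group acting on markoff_points(p)."""
    remaining = markoff_points(p)
    result = []
    while remaining:
        start = remaining.pop()
        orbit, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in neighbours(v, p):
                if w not in orbit:
                    orbit.add(w)
                    stack.append(w)
        remaining -= orbit
        result.append(orbit)
    return result


if __name__ == "__main__":
    for p in (5, 7, 11, 13):
        print(p, len(markoff_points(p)), len(orbits(p)))
```

Since every generator is an involution, the orbits coincide with the connected components of the graph whose edges join each point to its images; a single orbit therefore means the graph is connected for that prime. Such finite checks say nothing about large $p$.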
While Conjectures 1 and 2 have withstood our protracted attack over the past 17 years, much progress has been made on parallel questions in the case of affine linear maps. We will return to the recent resolution of Conjecture 3, and resulting progress on diophantine properties of Markoff numbers in Section 1.5.
1.3. Before describing the general setting of the Affine Linear Sieve, it is instructive to briefly examine an example which is in many ways parallel to the Markoff situation, namely integral Apollonian packings [63,127]. A theorem of Descartes asserts that $(x_{1}, x_{2}, x_{3}, x_{4}) \in \mathbb{R}^{4}$ are the curvatures of four mutually tangent circles in the plane if
\begin{equation*} 2\left(x_{1}^{2}+x_{2}^{2}+x_{3}^{2}+x_{4}^{2}\right)=\left(x_{1}+x_{2}+x_{3}+x_{4}\right)^{2}. \tag{1.3} \end{equation*}
Given an initial configuration of 4 such circles, we fill in repeatedly the lune regions with the unique circle which is tangent to 3 sides (which is possible by a theorem of Apollonius). In this way we get a packing of the outside circle by circles giving an Apollonian packing. The interesting diophantine feature is that if the initial curvatures are integral then so are the curvatures of the entire packing.
FIGURE 2

Integral Apollonian packing $(-11,21,24,28)$.

The numbers in the circles in Figure 2 indicate their curvatures; note that by convention the outer circle has negative curvature. Viewing equation (1.3) as a quadratic equation in $x_{1}$, we see that the two solutions are related as $x_{1}+x_{1}^{\prime}=2 x_{2}+2 x_{3}+2 x_{4}$, the crucial point being that the Vieta involutions in this case are given by linear maps $A_{1}, A_{2}, A_{3}, A_{4}$, where $A_{j}(e_{k})=-3 e_{k}+2(e_{1}+e_{2}+e_{3}+e_{4})$ if $k=j$ and $A_{j}(e_{k})=e_{k}$ if $k \neq j$ ($e_{1}, e_{2}, e_{3}, e_{4}$ are the standard basis vectors). The configurations of 4 mutually tangent circles in the packing with initial configuration $a=(a_{1}, a_{2}, a_{3}, a_{4})$ consist of points $x$ in the orbit $\mathcal{O}=\Lambda \cdot a$, where $\Lambda=\langle A_{1}, A_{2}, A_{3}, A_{4}\rangle$ is the Apollonian group. The elements $A_{j}$ preserve $F$ given by
$$F(x_{1}, x_{2}, x_{3}, x_{4})=2\left(x_{1}^{2}+x_{2}^{2}+x_{3}^{2}+x_{4}^{2}\right)-\left(x_{1}+x_{2}+x_{3}+x_{4}\right)^{2}$$
and hence $\Lambda \leq O_{F}(\mathbb{Z})$. The group $\Lambda$ is Zariski dense in $O_{F}$, but it is thin in $O_{F}(\mathbb{Z})$. For example, $|\{\gamma \in O_{F}(\mathbb{Z}):\|\gamma\| \leq T\}| \sim c_{1} T^{2}$ as $T \rightarrow \infty$, while $|\{\gamma \in \Lambda:\|\gamma\| \leq T\}| \sim c_{1} T^{\delta}$, where${}^{4}$ $\delta=1.3 \ldots$ is the Hausdorff dimension of the limit set of $\Lambda$.
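The integrality of the packing and the invariance of $F$ can be illustrated with a short Python sketch. It applies the Vieta move directly to a quadruple of curvatures, using the relation $x_{j}+x_{j}^{\prime}=2 \cdot(\text{sum of the other three})$; the function names and the bounded search are illustrative assumptions, not part of the text.

```python
def descartes_form(x):
    """F(x) = 2 * (x1^2 + ... + x4^2) - (x1 + ... + x4)^2; F(x) = 0 is equation (1.3)."""
    return 2 * sum(c * c for c in x) - sum(x) ** 2


def swap(x, j):
    """Vieta move in coordinate j: x_j -> 2 * (sum of the other three) - x_j."""
    y = list(x)
    y[j] = 2 * (sum(x) - x[j]) - x[j]
    return tuple(y)


def curvatures_up_to(root, bound):
    """Curvatures occurring in quadruples reachable from `root` through
    quadruples whose entries are all <= bound."""
    seen, stack = {root}, [root]
    while stack:
        x = stack.pop()
        for j in range(4):
            y = swap(x, j)
            if y not in seen and max(y) <= bound:
                seen.add(y)
                stack.append(y)
    return sorted({c for x in seen for c in x})


if __name__ == "__main__":
    root = (-11, 21, 24, 28)
    print(descartes_form(root))               # 0: the root satisfies (1.3)
    print([swap(root, j) for j in range(4)])  # the four neighbouring quadruples
    print(curvatures_up_to(root, 100))
```

For the root quadruple $(-11,21,24,28)$ the four moves produce the new curvatures 157, 61, 52, and 40, all integers, and `descartes_form` takes the same value before and after any move.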
The general setting of the Affine Linear Sieve, introduced in [20,21], is as follows. For $j=1,2, \ldots, k$, let $A_{j}$ be invertible integer coefficient polynomial maps from $\mathbb{Z}^{n}$ to $\mathbb{Z}^{n}$ (here $n \geq 1$ and the inverses of the $A_{j}$'s are assumed to be of the same type). Let $\Lambda$ be the group generated by $A_{1}, \ldots, A_{k}$ and let $\mathcal{O}=\Lambda b$ be the orbit of some $b \in \mathbb{Z}^{n}$ under $\Lambda$. Given a polynomial $f \in \mathbb{Q}[x_{1}, \ldots, x_{n}]$ which is integral on $\mathcal{O}$, the aim is to show that there are many points $x \in \mathcal{O}$ at which $f(x)$ has few or even the least possible number of prime factors, in particular that such points are Zariski dense in the Zariski closure, $\operatorname{Zcl}(\mathcal{O})$, of $\mathcal{O}$. Let $\mathcal{O}(f, r)$ denote the set of $x \in \mathcal{O}$ for which $f(x)$ has at most $r$ prime factors. As $r \rightarrow \infty$, the sets $\mathcal{O}(f, r)$ increase and potentially at some point become Zariski dense. Define the saturation number $r_{0}(\mathcal{O}, f)$ to be the least integer $r$ such that $\operatorname{Zcl}(\mathcal{O}(f, r))=\operatorname{Zcl}(\mathcal{O})$. It is by no means obvious that it is finite, or even whether one should expect it to be so in general. If it is finite, we say that the pair $(\mathcal{O}, f)$ saturates. In the case of linear maps, the theory by now is quite advanced, and the basic result pertaining to the finiteness of the saturation number in all cases where it is expected to hold, namely in the case of the Levi factor of $G=\operatorname{Zcl}(\Lambda)$ being semisimple,${}^{5}$ has been established [123].
4 This result can be deduced from the work of Lax and Phillips [93]. A beautiful overview of striking developments pertaining to dynamics on geometrically finite hyperbolic manifolds with applications to Apollonian circle packings (and beyond) is contained in Hee Oh's ICM report [114].
5 On the other hand, as detailed in [21, 85, 123], when a torus intervenes, saturation most likely fails. Tori pose particularly difficult problems, in terms of sparsity of elements in an orbit, strong approximation and diophantine properties: see [104] for a discussion of Artin's Conjecture in the context of strong approximation.
Both strong and superstrong approximation, particularly for thin groups such as the Apollonian group, are crucial ingredients in executing the Brun combinatorial sieve in this setting.
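To make the sets $\mathcal{O}(f, r)$ concrete, here is a small Python sketch for the Apollonian orbit of $b=(-11,21,24,28)$ with the illustrative choice $f(x)=x_{1} x_{2} x_{3} x_{4}$. Several things are assumptions made here rather than taken from the text: the choice of $f$, counting prime factors with multiplicity, truncating the orbit to quadruples with entries below a bound, and all function names.

```python
def big_omega(n):
    """Number of prime factors of |n|, counted with multiplicity (trial division)."""
    n = abs(n)
    if n == 0:
        raise ValueError("big_omega is undefined for 0")
    count, d = 0, 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    return count + (1 if n > 1 else 0)


def apollonian_swap(x, j):
    """Vieta move in coordinate j of a curvature quadruple."""
    y = list(x)
    y[j] = 2 * (sum(x) - x[j]) - x[j]
    return tuple(y)


def bounded_orbit(b, bound):
    """Orbit points reachable from b through quadruples with all entries <= bound."""
    seen, stack = {b}, [b]
    while stack:
        x = stack.pop()
        for j in range(4):
            y = apollonian_swap(x, j)
            if y not in seen and max(y) <= bound:
                seen.add(y)
                stack.append(y)
    return seen


def few_factor_points(points, f, r):
    """Finite analogue of O(f, r): points at which f has at most r prime factors."""
    return {x for x in points if big_omega(f(x)) <= r}


def product_of_curvatures(x):
    return x[0] * x[1] * x[2] * x[3]


if __name__ == "__main__":
    points = bounded_orbit((-11, 21, 24, 28), 200)
    for r in range(8, 14):
        print(r, len(few_factor_points(points, product_of_curvatures, r)))
```

The printed counts are nondecreasing in $r$, reflecting that the sets $\mathcal{O}(f, r)$ increase with $r$. Zariski density and the saturation number itself concern the full infinite orbit and cannot be read off from such a finite truncation.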
1.4. The strong approximation for $\mathrm{SL}_{n}(\mathbb{Z})$, asserting that the reduction $\pi_{q}$ modulo $q$ is onto, is a consequence of the Chinese remainder theorem; its extension to arithmetic groups is far less elementary but well understood [118]. If $S$ is a finite symmetric generating set of $\mathrm{SL}_{n}(\mathbb{Z})$, strong approximation is equivalent to the assertion that the Cayley graphs $\mathcal{E}(\mathrm{SL}_{n}(\mathbb{Z}/q\mathbb{Z}), \pi_{q}(S))$ are connected. The quantification of this statement, asserting that they are in fact highly connected, that is to say, form a family of expanders, is what we mean by superstrong approximation. The proof of the expansion property for $\mathrm{SL}_{2}(\mathbb{Z})$ has its roots in Selberg's celebrated lower bound [131] of $\frac{3}{16}$ for the first eigenvalue of the Laplacian on the hyperbolic surfaces associated with congruence subgroups of $\mathrm{SL}_{2}(\mathbb{Z})$. The generalization of the expansion property to $G(\mathbb{Z})$, where $G$ is a semisimple matrix group defined over $\mathbb{Q}$, is also known thanks to developments towards the general Ramanujan conjectures that have been established; this expansion property is also referred to as property $\tau$ for congruence subgroups [133].
Let $\Gamma$ be a finitely generated subgroup of $\mathrm{GL}_{n}(\mathbb{Z})$ and let $G=\operatorname{Zcl}(\Gamma)$. The discussion of the previous paragraph applies if $\Gamma$ is of finite index in $G(\mathbb{Z})$. However, if $\Gamma$ is thin, that is to say, of infinite index in $G(\mathbb{Z})$, then $\operatorname{vol}(G(\mathbb{R}) \backslash \Gamma)=\infty$ and the techniques used to prove both of these properties do not apply. It is remarkable that, under a suitable natural hypothesis, strong approximation continues to hold in this thin context, as proved by Matthews, Vaserstein, and Weisfeiler in 1984 [105,143]. That the expansion property might continue to hold for thin groups was first suggested by Lubotzky and Weiss in 1993 [101]; for $\mathrm{SL}_{2}(\mathbb{Z})$, the issue is neatly encapsulated in the following 1-2-3 question of Lubotzky [99]. For a prime $p \geq 5$ and $i=1,2,3$, let us define $S_{p}^{i}=\left\{\begin{pmatrix} 1 & i \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ i & 1 \end{pmatrix}\right\}$. Let $\mathcal{E}_{p}^{i}=\mathcal{E}(\mathrm{SL}_{2}(\mathbb{Z}/p\mathbb{Z}), S_{p}^{i})$, the Cayley graph of $\mathrm{SL}_{2}(\mathbb{Z}/p\mathbb{Z})$ with respect to $S_{p}^{i}$. By Selberg's theorem, $\mathcal{E}_{p}^{1}$ and $\mathcal{E}_{p}^{2}$ are families of expander graphs. However, the group $\left\langle\begin{pmatrix} 1 & 3 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 3 & 1 \end{pmatrix}\right\rangle$ has infinite index in $\mathrm{SL}_{2}(\mathbb{Z})$ and thus does not come under the purview of Selberg's theorem.
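The graphs in Lubotzky's question are easy to build for small primes. The following Python sketch runs a breadth-first search on the Cayley graph of $\mathrm{SL}_{2}(\mathbb{Z}/p\mathbb{Z})$ with respect to $S_{p}^{i}$ together with inverses (adding the inverses, the tuple encoding of matrices, and the function names are assumptions made here). It checks only connectivity and reports the largest distance from the identity; expansion is a uniform spectral statement as $p \rightarrow \infty$ and cannot be established by such finite experiments.

```python
from collections import deque


def mat_mul(a, b, p):
    """Product of 2x2 matrices stored as (a11, a12, a21, a22), entries reduced mod p."""
    return (
        (a[0] * b[0] + a[1] * b[2]) % p,
        (a[0] * b[1] + a[1] * b[3]) % p,
        (a[2] * b[0] + a[3] * b[2]) % p,
        (a[2] * b[1] + a[3] * b[3]) % p,
    )


def generators(i, p):
    """The two matrices of S_p^i and their inverses, reduced mod p."""
    return [(1, i % p, 0, 1), (1, 0, i % p, 1), (1, -i % p, 0, 1), (1, 0, -i % p, 1)]


def cayley_bfs(i, p):
    """Return (number of group elements reached, largest distance from the identity)."""
    identity = (1, 0, 0, 1)
    dist = {identity: 0}
    queue = deque([identity])
    gens = generators(i, p)
    while queue:
        g = queue.popleft()
        for s in gens:
            h = mat_mul(g, s, p)
            if h not in dist:
                dist[h] = dist[g] + 1
                queue.append(h)
    return len(dist), max(dist.values())


if __name__ == "__main__":
    for p in (5, 7, 11):
        for i in (1, 2, 3):
            reached, radius = cayley_bfs(i, p)
            print(p, i, reached, p * (p * p - 1), radius)
```

For $p \geq 5$ each of $i=1,2,3$ is invertible modulo $p$, so the search reaches all $p(p^{2}-1)$ elements of $\mathrm{SL}_{2}(\mathbb{Z}/p\mathbb{Z})$, including for $i=3$, where the group generated in $\mathrm{SL}_{2}(\mathbb{Z})$ has infinite index.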
In my thesis [66], extending the work of Sarnak and Xue [129], [128] for cocompact arithmetic lattices, a generalization of Selberg's theorem for infinite index "congruence" subgroups of $\mathrm{SL}_{2}(\mathbb{Z})$ was proved; for such subgroups with a high enough Hausdorff dimension of the limit set, a spectral gap property was established. Following the groundbreaking work of Helfgott [77] (which builds crucially on the sum-product estimate in $\mathbb{F}_{p}$ due to Bourgain, Katz, and Tao [27]), Bourgain and Gamburd [13] gave a complete answer to Lubotzky's question. The method introduced in [12,13] and developed in a series of papers [14-19] became known as the "Bourgain-Gamburd expansion machine"; thanks to a number of major developments by many people [22,28,35,82,91,115,120,122,124], the general superstrong approximation for thin groups is now known. The state of the art is summarized in Thin groups and superstrong approximation [36], which contains an expanded version of most of the invited lectures from the eponymous MSRI "Hot Topics" workshop, in the surveys by Breuillard [33] and Helfgott [78], and in the book by Tao, "Expansion in finite simple groups of Lie type" [140].
1.5. We return to the progress on Conjecture 3 [23-26]. Our first result [25] asserts that there is a very large orbit.
Theorem 1. Fix $\varepsilon>0$. Then for $p$ a large prime, there is a $\Gamma$ orbit $\mathcal{C}(p)$ in $X^{*}(p)$ for which
\begin{equation*} \left|X^{*}(p) \backslash \mathcal{C}(p)\right| \leq p^{\varepsilon} \tag{1.4} \end{equation*}
(note that $|X^{*}(p)| \sim p^{2}$), and any $\Gamma$ orbit $\mathscr{D}(p)$ satisfies${}^{6}$
\begin{equation*} |\mathscr{D}(p)| \gg(\log p)^{\frac{1}{3}}. \tag{1.5} \end{equation*}
The proof, discussed in Section 3, establishes the strong approximation conjecture, unless $p^{2}-1$ is a very smooth number. In particular, the set of primes for which the strong approximation conjecture fails is very small.
Theorem 2. Let $E$ be the set of primes for which the strong approximation conjecture fails. For $\varepsilon>0$, the number of primes $p \leq T$ with $p \in E$ is at most $T^{\varepsilon}$, for $T$ large.
Very recently, in a remarkable breakthrough, using geometric techniques involving Hurwitz stacks, degeneration, and some Galois theory, William Chen [45] proved the following result:
Theorem 3. Every $\Gamma$ orbit $\mathscr{D}(p)$ has size divisible by $p$.
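For very small primes the orbit structure can be enumerated directly. The following is a minimal sketch, not the method of proof: it assumes the normalization $x_{1}^{2}+x_{2}^{2}+x_{3}^{2}=3 x_{1} x_{2} x_{3}$ for the Markoff equation (1.1), which is not restated in this section, and it assumes that $\Gamma$ is generated by the three Vieta involutions together with the coordinate permutations. The function names are illustrative only.

```python
from itertools import permutations


def markoff_points(p):
    """Nonzero solutions of x1^2 + x2^2 + x3^2 = 3*x1*x2*x3 over Z/pZ (assumed normalization)."""
    return {
        (x, y, z)
        for x in range(p)
        for y in range(p)
        for z in range(p)
        if (x * x + y * y + z * z - 3 * x * y * z) % p == 0 and (x, y, z) != (0, 0, 0)
    }


def neighbours(pt, p):
    """Images of pt under the Vieta involutions R_1, R_2, R_3 and all coordinate permutations."""
    x = list(pt)
    out = []
    for j in range(3):
        i, k = [t for t in range(3) if t != j]
        y = x[:]
        # Vieta involution: replace x_j by the other root of the quadratic in x_j.
        y[j] = (3 * x[i] * x[k] - x[j]) % p
        out.append(tuple(y))
    out.extend(tuple(x[s] for s in perm) for perm in permutations(range(3)))
    return out


def orbit_sizes(p):
    """Sizes of the orbits on the nonzero solutions mod p, largest first."""
    points = markoff_points(p)
    seen, sizes = set(), []
    for start in points:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            pt = stack.pop()
            size += 1
            for q in neighbours(pt, p):
                if q not in seen:
                    seen.add(q)
                    stack.append(q)
        sizes.append(size)
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    for p in (5, 7, 11, 13):
        print(p, orbit_sizes(p))
```

The brute-force enumeration costs on the order of $p^{3}$ operations, so it is only meant for sanity checks on small primes, where each reported orbit size can be compared with the divisibility statement of Theorem 3.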
Combining Theorems 1 and 3 establishes Conjecture 3 for all sufficiently large primes; in combination with the following result established in [26], namely
Theorem 4. Assume that $X^{*}(\mathbb{Z} / p \mathbb{Z})$ is connected. Then $X^{*}\left(\mathbb{Z} / p^{k} \mathbb{Z}\right)$ is connected for all $k$.
it yields
Theorem 5. For all sufficiently large primes $p$, the group $\Gamma$ acts minimally on $X^{*}\left(\mathbb{Z}_{p}\right)$.
We remark that Theorem 5 is not true for $X^{*}(\mathbb{R})$; cf. Section 4.1. While Conjecture 1 remains out of reach, the progress on strong approximation allows us to establish the following result on the diophantine${ }^{7}$ properties of Markoff numbers [25]:
Theorem 6. Almost all Markoff numbers are composite, that is,
$$\sum_{\substack{p \in M^{s} \\ p \text { prime }, p \leq T}} 1=o\left(\sum_{\substack{m \in M^{s} \\ m \leq T}} 1\right)$$
It is worth contrasting this result with the state of knowledge regarding the sequence $H_{n}=2^{n}+b$, which is just a little more sparse than the sequence of Markoff numbers, for which, by Zagier's result (1.2), we have $M_{n} \sim A^{\sqrt{n}}$. Even the generalized Riemann Hypothesis, which allowed Hooley [79] to give a conditional proof of Artin's conjecture (cf. footnote 5), was not sufficient to establish that almost all members of the sequence $H_{n}$ are composite: the conditional proof in [80] necessitated postulating an additional "Hypothesis A."
1.6. The methods of proof of Theorems 1, 2, 4 discussed in Section 3 are robust enough to enable handling their extension to more general Markoff-type cubic surfaces, namely
\begin{equation*} X_{k}: \Phi\left(x_{1}, x_{2}, x_{3}\right)=x_{1}^{2}+x_{2}^{2}+x_{3}^{2}-x_{1} x_{2} x_{3}=k \tag{1.6} \end{equation*}
where the real dynamics was studied by Goldman [73], as discussed in Section 4.1; the family of surfaces $S_{A, B, C, D} \subset \mathbb{C}^{3}$ given by
\begin{equation*} x_{1}^{2}+x_{2}^{2}+x_{3}^{2}+x_{1} x_{2} x_{3}=A x_{1}+B x_{2}+C x_{3}+D \tag{1.7} \end{equation*}
where the real dynamics was studied by Cantat [38], as discussed in Section 4.2; those in [60] and even the general such nondegenerate cubic surface
\begin{equation*} Y=Y(\alpha, \beta, \gamma, \delta): \sum_{i, j=1}^{3} \alpha_{i j} x_{i} x_{j}+\sum_{j=1}^{3} \beta_{j} x_{j}+\gamma=\delta x_{1} x_{2} x_{3} \tag{1.8} \end{equation*}
with $\alpha_{i j}, \beta_{j}, \gamma, \delta$ being integers.
The group $\Gamma_{Y}$ is again generated by the corresponding Vieta involutions $R_{1}, R_{2}, R_{3}$. For such a $Y$ and action $\Gamma_{Y}$, one must first show that there are only finitely many finite orbits in $Y(\overline{\mathbb{Q}})$, and that these may be determined effectively. The analogue of Conjecture 1 for $Y$ is that for $p$ large, $\Gamma_{Y}$ has one big orbit on $Y(\mathbb{Z} / p \mathbb{Z})$ and that the remaining orbits, if there are any, correspond to one of the finite $\overline{\mathbb{Q}}$ orbits determined above.
The determination of the finite orbits of $\Gamma$ on $X_{k}(\overline{\mathbb{Q}})$ and on $S_{A, B, C, D}(\overline{\mathbb{Q}})$ has been carried out in [56] and [96], respectively. Remarkably, for these, the $\Gamma$ action on affine 3-space corresponds to the (nonlinear) monodromy group for Painlevé VI equations on their parameter spaces. In this way the finite orbits in question turn out to correspond bijectively to those Painlevé VI's which are algebraic functions of their independent variable.
In this setting of the more general surfaces $Y$ in (1.8), strong approximation for $Y\left(\mathbb{Z}_{S}\right)$, where $S$ is the set of primes dividing $\alpha_{11}, \alpha_{22}, \alpha_{33}$ (so that $\Gamma_{Y}$ preserves the $S$-integers $\mathbb{Z}_{S}$), will follow from Conjecture 1 for $Y$ (and the results we can prove towards it, as in Theorem 2) once we have a point of infinite order in $Y\left(\mathbb{Z}_{S}\right)$. If there is no such point, we can increase $S$ or replace $\mathbb{Z}$ by $\mathcal{O}_{K}$, the ring of integers in a number field $K / \mathbb{Q}$, to produce such a point and with it strong approximation for $Y\left(\left(\mathcal{O}_{K}\right)_{S}\right)$.
Vojta's conjectures and the results proven towards them [51,141] assert that cubic and higher-degree affine surfaces typically have few $S$-integral points. In the rare cases where these points are Zariski dense, such as tori (e.g., $N\left(x_{1}, x_{2}, x_{3}\right)=k$ where $N$ is the norm form of a cubic extension of $\mathbb{Q}$), strong approximation fails. So these Markoff surfaces appear to be rather special affine cubic surfaces not only having a Zariski dense set of integral points, but also a robust strong approximation.
1.7. Zagier's result (1.2) can be viewed as a statement about asymptotic growth of integral points on the Markoff variety, $|X(\mathbb{Z}) \cap B(T)| \sim(\log T)^{2}$. In Section 5 we discuss the work in [68], establishing an asymptotic formula for the number of integer solutions to the Markoff-Hurwitz equation
\begin{equation*} x_{1}^{2}+x_{2}^{2}+\cdots+x_{n}^{2}=a x_{1} x_{2} \cdots x_{n}+k \tag{1.9} \end{equation*}
giving an interpretation of the exponent of growth, which for $n>3$ is not integral, in terms of the unique parameter for which there exists a certain conformal measure on a projective space.
1.8. The issue of the existence of a single integral solution to (1.9) for general $a$ and $k$, even for $n=3$, is quite subtle; see [112,130]. In the work of Ghosh and Sarnak [71], the Hasse principle is established to hold for Markoff-type cubic surfaces $X(k)$ given by (1.6) for almost all $k$, but it also fails to hold for infinitely many $k$; this work is discussed in Section 6.
1.9. Regrettably, the space/time constraints prevented us from covering cognate results pertaining to arithmetic and dynamics on $\mathrm{K}3$ surfaces; see $[37,65,106,108,109,135]$ and references therein. The Markoff equation over quadratic imaginary fields is studied in [134]. Potential cryptographic applications of Markoff graphs are discussed in [64].
1.10. To conclude this introduction, let us note that $X_{k}$ is the relative character variety of representations of the fundamental group of a surface of genus 1 with one puncture to $\mathrm{SL}_{2}$. The action of the mapping class group is that of $\Gamma$. More generally, the (affine) relative character variety $V_{k}$ of representations of $\pi_{1}\left(\Sigma_{g, n}\right)$, where $\Sigma_{g, n}$ is a surface of genus $g$ with $n$ punctures, into $\mathrm{SL}_{2}$ is defined over $\mathbb{Z}$, and one can study the diophantine properties of $V_{k}(\mathbb{Z})$. In the work of Whang [144-146], it was shown that $V_{k}$ has a projective compactification relative to which $V_{k}$ is "log-Calabi-Yau." According to the conjectures of Vojta, this places $V_{k}$ in the same threshold setting as affine cubic surfaces. Moreover, $V_{k}(\mathbb{Z})$ has a full descent in that the mapping class group acts via nonlinear morphisms on $V_{k}(\mathbb{Z})$ with finitely many orbits. These and more general character varieties connected with higher Teichmüller theory offer a rich family of threshold affine varieties for which one can approach the study of integral points.

2. THE UNREASONABLE(?) UBIQUITY OF MARKOFF EQUATION

Markoff equation and numbers appear in a surprising variety of contexts: see, for example, [1] (subtitled Mathematical Journey from Irrational Numbers to Perfect Matchings) and the references therein.
2.1. The Markoff chain. Equation (1.1) was discovered by Markoff in 1879 in his work on badly approximable numbers. As the sentiment${ }^{8}$ expressed by Frobenius [62] in 1913 seems to remain true today, we briefly review the context and statement of Markoff's theorem.
Let $\alpha$ be an irrational number. A celebrated theorem of Hurwitz asserts that $\alpha$ admits infinitely many rational approximations $p / q$ such that $\left|\alpha-\frac{p}{q}\right|<\frac{1}{\sqrt{5} q^{2}}$, and, moreover, that if $\alpha$ is $\mathrm{GL}_{2}(\mathbb{Z})$-equivalent to the Golden Ratio $\theta_{1}=(1+\sqrt{5}) / 2$, in the sense that $\alpha=\frac{a \theta_{1}+b}{c \theta_{1}+d}$ for some integers $a, b, c, d$ with $a d-b c= \pm 1$, the above result is sharp and the constant $\frac{1}{\sqrt{5}}$ cannot be replaced by any smaller one.
Suppose next that $\alpha$ is not $\mathrm{GL}_{2}(\mathbb{Z})$-equivalent to $\theta_{1}$. Then the result of Markoff's doctoral advisors, Korkine and Zolotareff [88], asserts that $\alpha$ admits infinitely many rational approximations $p / q$ such that $\left|\alpha-\frac{p}{q}\right|<\frac{1}{\sqrt{8} q^{2}}$, and, moreover, that the constant $\frac{1}{\sqrt{8}}$ is sharp if and only if $\alpha$ is $\mathrm{GL}_{2}(\mathbb{Z})$-equivalent to $\theta_{2}=1+\sqrt{2}$.
The general result found by Markoff in his Habilitation and published in 1879 and 1880 in Mathematische Annalen is as follows.
Markoff's Theorem. Let $\mathbb{M}=\{1,2,5,13,29,34,89,169,194, \ldots\}$ be the sequence of Markoff numbers. There is a sequence of associated quadratic irrationals $\theta_{i} \in \mathbb{Q}\left(\sqrt{\Delta_{i}}\right)$, where $\Delta_{i}=9 m_{i}^{2}-4$ and $m_{i}$ is the $i$th element of the sequence, with the following property. Let $\alpha$ be a real irrational, not $\mathrm{GL}_{2}(\mathbb{Z})$-equivalent to any of the numbers $\theta_{i}$ whenever $m_{i}<m_{j}$. Then $\alpha$ admits infinitely many rational approximations $p / q$ with $\left|\alpha-\frac{p}{q}\right|<\frac{m_{j}}{\sqrt{\Delta_{j}} q^{2}}$; the constant $m_{j} / \sqrt{\Delta_{j}}$ is sharp if and only if $\alpha$ is $\mathrm{GL}_{2}(\mathbb{Z})$-equivalent to $\theta_{h}$, for some $h$ such that $m_{h}=m_{j}$.
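As a quick numerical illustration of the constants in the theorem, the sketch below evaluates $m_{j} / \sqrt{\Delta_{j}}$ for the Markoff numbers listed above. The list `MARKOFF` is copied from the statement; the function name is an illustrative choice. For $m=1$ and $m=2$ one recovers the constants $1 / \sqrt{5}$ and $1 / \sqrt{8}$ of Hurwitz and of Korkine and Zolotareff, and the values decrease towards $1/3$.

```python
from math import sqrt

# Markoff numbers as listed in the statement of the theorem.
MARKOFF = [1, 2, 5, 13, 29, 34, 89, 169, 194]


def approximation_constant(m):
    """Return m / sqrt(Delta) with Delta = 9*m^2 - 4, the constant attached to m."""
    delta = 9 * m * m - 4
    return m / sqrt(delta)


if __name__ == "__main__":
    for m in MARKOFF:
        print(m, 9 * m * m - 4, approximation_constant(m))
```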
2.2. Continued fractions and binary quadratic forms. The first paper by Markoff [102] used the theory of continued fractions, while the second memoir [103] was based on the theory of binary indefinite quadratic forms, with the final result stated as a theorem on minima of binary indefinite quadratic forms.
The alternative approach based on indefinite binary quadratic forms was the subject of an important memoir by Frobenius [62] and complete details were finally provided by Remak [121] and much simplified by Cassels [39,40].
2.3. The geometry of Markoff numbers. A third way of looking at the problem, via hyperbolic geometry, was introduced by Gorshkov [74] in his thesis of 1953, but published only in 1977. The connection with hyperbolic geometry was rediscovered, in a somewhat different way, by Cohn [46]. The paper by Caroline Series [132] contains a beautiful exposition of the problem in this context.
2.4. Cohn tree and Nielsen transformations. Cohn is also credited for the interpretation of the problem [47] in the context of primitive words in $F_{2}$, the free group on two generators. Its automorphism group $\Phi_{2}=\operatorname{Aut}\left(F_{2}\right)$ is generated by the following Nielsen transformations: $(a, b)^{P}=(b, a)$, $(a, b)^{\sigma}=\left(a^{-1}, b\right)$, $(a, b)^{U}=\left(a^{-1}, a b\right)$. Let $V=\sigma U$. Then $(a, b)^{V}=(a, a b)$.
The Cohn tree is a binary tree with root $a b$, branching to the top with $U$ and to the bottom with $V$.
Markoff numbers are obtained from the Cohn tree by taking a third of the trace of the matrix obtained by substituting the matrices $A=\left(\begin{array}{ll}5 & 2 \\ 2 & 1\end{array}\right)$ and $B=\left(\begin{array}{ll}2 & 1 \\ 1 & 1\end{array}\right)$ in place of $a, b$ and performing the matrix multiplication.
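The trace recipe can be tried out on a few words. The sketch below is an illustration under an explicit assumption: instead of tracking the $U$ and $V$ branching literally, it uses the familiar Farey-type recursion on triples of matrices $(L, LR, R)$ starting from $(A, AB, B)$, whose children are $(L, L \cdot LR, LR)$ and $(LR, LR \cdot R, R)$. Whether this traversal matches the paper's drawing of the tree level by level is not claimed. All function names are hypothetical.

```python
A = ((5, 2), (2, 1))
B = ((2, 1), (1, 1))


def matmul(X, Y):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def third_of_trace(X):
    """A third of the trace, as in the recipe above."""
    return (X[0][0] + X[1][1]) // 3


def cohn_triples(depth):
    """Yield triples (L, L*R, R) level by level, starting from (A, A*B, B)."""
    level = [(A, matmul(A, B), B)]
    for _ in range(depth + 1):
        next_level = []
        for L, M, R in level:
            yield L, M, R
            next_level.append((L, matmul(L, M), M))
            next_level.append((M, matmul(M, R), R))
        level = next_level


def markoff_numbers(depth):
    """Sorted list of the numbers read off from all triples up to the given depth."""
    found = set()
    for triple in cohn_triples(depth):
        found.update(third_of_trace(X) for X in triple)
    return sorted(found)


if __name__ == "__main__":
    print(markoff_numbers(3))
```

For instance, $A B=\left(\begin{array}{cc}12 & 7 \\ 5 & 3\end{array}\right)$ has trace 15, giving the Markoff number 5, while $A$ and $B$ themselves give 2 and 1.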
2.5. Nielsen systems and product replacement graphs. Conjecture 3 is a special case of Conjecture $Q$ made by McCullough and Wanderley [107] in the context of Nielsen systems and product replacement graphs.
Given a group $G$, the product replacement graph $\Gamma_{k}(G)$ introduced in [42] in connection with computing in finite groups is defined as follows. The vertices of $\Gamma_{k}(G)$ consist of all $k$-tuples of generators $\left(g_{1}, \ldots, g_{k}\right)$ of the group $G$. For every $(i, j)$, $1 \leq i, j \leq k$, $i \neq j$, there is an edge corresponding to transformations $L_{i, j}^{ \pm}$ and $R_{i, j}^{ \pm}$, where $R_{i, j}^{ \pm}:\left(g_{1}, \ldots, g_{i}, \ldots, g_{k}\right) \rightarrow\left(g_{1}, \ldots, g_{i} \cdot g_{j}^{ \pm 1}, \ldots, g_{k}\right)$ and $L_{i, j}^{ \pm}$ is defined similarly. The graphs $\Gamma_{k}(G)$ are regular, of degree $4 k(k-1)$, possibly with loops and multiple edges. The connectivity of $\Gamma_{k}(G)$ has been the subject of intensive recent investigations; for $G=\mathrm{SL}_{2}(p)$ and $k \geq 3$, it was established by Gilman in [72].
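To make the definition concrete, here is a small sketch that builds $\Gamma_{k}(G)$ for a permutation group given by an explicit list of its elements. The choice of $G=S_{3}$ and $k=2$ in the usage lines, the representation of permutations as tuples, and all function names are assumptions made for illustration. Here $L_{i, j}^{ \pm}$ is taken to be left multiplication of $g_{i}$ by $g_{j}^{ \pm 1}$, which is one reading of "defined similarly."

```python
from itertools import permutations, product


def compose(p, q):
    """Composition of permutations stored as tuples: (p o q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)


def generated_subgroup(gens, identity):
    """Closure of gens under multiplication (enough in a finite group)."""
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def product_replacement_graph(elements, k):
    """Map each generating k-tuple to the list of its 4k(k-1) neighbours."""
    identity = tuple(range(len(elements[0])))
    vertices = [
        t for t in product(elements, repeat=k)
        if len(generated_subgroup(t, identity)) == len(elements)
    ]
    graph = {}
    for v in vertices:
        nbrs = []
        for i in range(k):
            for j in range(k):
                if i == j:
                    continue
                for h in (v[j], inverse(v[j])):
                    right, left = list(v), list(v)
                    right[i] = compose(v[i], h)  # R_{i,j}^{+-}
                    left[i] = compose(h, v[i])   # L_{i,j}^{+-} (assumed convention)
                    nbrs += [tuple(right), tuple(left)]
        graph[v] = nbrs
    return graph


def count_components(graph):
    seen, components = set(), 0
    for start in graph:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            v = stack.pop()
            for w in graph[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return components


if __name__ == "__main__":
    S3 = list(permutations(range(3)))
    g = product_replacement_graph(S3, 2)
    print(len(g), count_components(g))
```

Neighbours are stored with multiplicity, so loops and multiple edges show up as repeated entries and every vertex has exactly $4 k(k-1)$ entries in its list.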
In the case of the free group $F_{k}$, the moves $L_{i, j}^{ \pm}$ and $R_{i, j}^{ \pm}$ defined above correspond to Nielsen moves on $\Gamma_{k}\left(F_{k}\right)$. For every group $G$, the set $\Gamma_{k}(G)$ can be identified with $E=\operatorname{Epi}\left(F_{k}, G\right)$, the set of epimorphisms from $F_{k}$ onto $G$, and the group $A=\operatorname{Aut}\left(F_{k}\right)$ acts on $E$ in the following way: if $\alpha \in A$ and $\varphi \in E$, then $\alpha(\varphi)=\varphi \cdot \alpha^{-1}$. A long-standing problem is whether $\operatorname{Aut}\left(F_{k}\right)$ has property (T) for $k \geq 4$; in [100] Lubotzky and Pak observed that a positive answer to this problem implies the expansion of $\Gamma_{k}(G)$ for all $G$ and proved that $\Gamma_{k}(G)$ are expanders when $G$ is nilpotent of class $l$ and both $k$ and $l$ are fixed. Property (T) for $\operatorname{Aut}\left(F_{k}\right)$ for $k \geq 5$ was recently established in [84].${ }^{9}$ Note that $\operatorname{Aut}\left(F_{2}\right)$ and $\operatorname{Aut}\left(F_{3}\right)$ do not satisfy property (T), while the problem is still open for $k=4$.
In a joint work with Pak [69], we established a connection between the expansion coefficient of the product replacement graph $\Gamma_{k}(G)$ and the minimal expansion coefficient of a Cayley graph of $G$ with $k$ generators, and, in particular, proved that for $k>3$ the product replacement graphs $\Gamma_{k}(\operatorname{SL}(2, p))$ form an expander family under the assumption of strong uniform expansion of $\operatorname{SL}(2, p)$ on $k$ generators. In a joint work with Breuillard [34], combining the "expansion machine" [13] with the uniform Tits Alternative${ }^{10}$ established by Breuillard [32], we proved that Cayley graphs of $\operatorname{SL}(2, p)$ are strongly uniformly expanding for infinitely many primes of density one. Consequently, the following form of nonlinear superstrong approximation is obtained:
9 The proof stems from the groundbreaking observation by Ozawa [116] that $G$ satisfies Kazhdan's property (T) if there exist $\lambda>0$ and finitely many elements $\xi_{i}$ of $\mathbb{R}[G]$ such that $\Delta^{2}-\lambda \Delta=\sum_{i} \xi_{i}^{*} \xi_{i}$, where $\Delta$ is the Laplacian of the finite symmetric generating set of $G$.
Theorem 7. Let $k>3$. The family of product replacement graphs $\left\{\Gamma_{k}\left(\operatorname{SL}\left(2, p_{n}\right)\right)\right\}_{n}$ forms a family of expanders for infinitely many primes $p_{n}$ of density one.
As detailed in [107], the situation is different for the product replacement graph of $\operatorname{SL}\left(2, \mathbb{F}_{p}\right)$ on 2 generators, due to the Fricke identity for $2 \times 2$ matrices $M$ and $N$:
\begin{equation*} \operatorname{tr}(M)^{2}+\operatorname{tr}(N)^{2}+\operatorname{tr}(M N)^{2}=\operatorname{tr}(M) \operatorname{tr}(N) \operatorname{tr}(M N)+\operatorname{tr}([M, N])+2 \tag{2.1} \end{equation*}
Letting $x_{1}=\operatorname{tr}(M), x_{2}=\operatorname{tr}(N), x_{3}=\operatorname{tr}(M N)$, the $Q$ conjecture${ }^{11}$ in [107] amounts to the assertion of the strong approximation for the surfaces
\begin{align*} X_{k}: \Phi\left(x_{1}, x_{2}, x_{3}\right) & =k \\ \Phi\left(x_{1}, x_{2}, x_{3}\right) & =x_{1}^{2}+x_{2}^{2}+x_{3}^{2}-x_{1} x_{2} x_{3} \tag{2.3} \end{align*}
and $k=\operatorname{tr}([M, N])+2$, with the Markoff surface${ }^{12}$ being the special case corresponding to $\operatorname{tr}([M, N])=-2$.
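The identity (2.1) is easy to test numerically on determinant-one integer matrices. The sketch below takes $[M, N]=M N M^{-1} N^{-1}$ as the commutator convention (an assumption; the trace does not depend on the usual variants) and uses exact integer arithmetic. The helper names and the random-walk sampler are illustrative. For the matrices $A$ and $B$ of Section 2.4 one finds traces $6, 3, 15$ and $\operatorname{tr}([A, B])=-2$, i.e. the case $k=0$.

```python
import random


def matmul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def trace(X):
    return X[0][0] + X[1][1]


def sl2_inverse(X):
    """Inverse of a 2x2 matrix of determinant 1."""
    (a, b), (c, d) = X
    return ((d, -b), (-c, a))


def commutator(M, N):
    """[M, N] = M N M^{-1} N^{-1} (assumed convention)."""
    return matmul(matmul(M, N), matmul(sl2_inverse(M), sl2_inverse(N)))


def fricke_defect(M, N):
    """Left side minus right side of (2.1); zero when the identity holds."""
    x1, x2, x3 = trace(M), trace(N), trace(matmul(M, N))
    lhs = x1 * x1 + x2 * x2 + x3 * x3
    rhs = x1 * x2 * x3 + trace(commutator(M, N)) + 2
    return lhs - rhs


def random_sl2(steps, rng):
    """Random product of elementary matrices, hence of determinant 1."""
    gens = [((1, 1), (0, 1)), ((1, 0), (1, 1)), ((1, -1), (0, 1)), ((1, 0), (-1, 1))]
    X = ((1, 0), (0, 1))
    for _ in range(steps):
        X = matmul(X, rng.choice(gens))
    return X


if __name__ == "__main__":
    A = ((5, 2), (2, 1))
    B = ((2, 1), (1, 1))
    print(trace(commutator(A, B)), fricke_defect(A, B))
```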

3. STRONG APPROXIMATION

We give a brief overview of the methods and tools used in the proof of Theorems 1 and 2 and some comments about their extensions to the setting of more general surfaces of Markoff type. Theorem 1, in the weaker form that $|\mathcal{C}(p)| \sim\left|X^{*}(p)\right|$ as $p \rightarrow \infty$, can be viewed as the finite field analogue of [73] where it is shown that the action of $\Gamma$ on the compact real components of the relative character variety of the mapping class group of the once punctured torus is ergodic. As in [73] our proof makes use of the rotations $\tau_{i j} \circ R_{j}$, $i \neq j$, where $\tau_{i j}$ permutes $x_{i}$ and $x_{j}$. These preserve the conic sections gotten by intersecting $X^{*}(p)$ with the planes $y_{k}=x_{k}$ ($k$ different from $i$ and $j$). If $\tau_{i j} \circ R_{j}$ has order $t_{1}$ (here $t_{1} \mid p(p-1)(p+1)$), then $x$ and these $t_{1}$ points of the conic section are connected (i.e., are in the same $\Gamma$ orbit). If $t_{1}$ is maximal (i.e., is $p$, $p-1$, or $p+1$), then this entire conic section is connected and such conic sections in different planes which intersect are also connected. This leads to a large component which we denote by $\mathcal{C}(p)$.
10 This states that if the subgroup of $\mathrm{GL}_{d}(K)$ (where $K$ is an algebraic number field) generated by $F$ is not virtually solvable, then there is some $N \in \mathbb{N}$, depending only on $d$, such that $\left(F \cup F^{-1} \cup\{1\}\right)^{N}$ contains two elements that generate a nonabelian free group.
11 See the paper of Will Chen [45] for the discussion of the relation between this conjecture and the connectivity properties of the moduli spaces of elliptic curves with $G=\operatorname{SL}(2, p)$ structures.
12 Note that the congruence $x^{2}+y^{2}+z^{2} \equiv x y z(\bmod 3)$ has no nontrivial solutions.
If our starting rotation has order $t_{1}$ which is not maximal, then the idea is to ensure that among the $t_{1}$ points to which it is connected, at least one has a corresponding rotation of order $t_{2}>t_{1}$, and then to repeat. To ensure that one can progress in this way, a critical equation over $\mathbb{F}_{p}$ intervenes:
\begin{equation*} \left\{\begin{array}{l} x+\frac{b}{x}=y+\frac{1}{y}, \quad b \neq 1 \\ x \in H_{1}, y \in H_{2} \text { with } H_{1}, H_{2} \text { subgroups of } \mathbb{F}_{p}^{*}\left(\text { or } \mathbb{F}_{p^{2}}^{*}\right) \end{array}\right. \tag{3.1} \end{equation*}
If $t_{1}=\left|H_{1}\right| \geq p^{1 / 2+\delta}$ (with $\delta$ small and fixed), one can apply the proven Riemann Hypothesis for curves over finite fields [142] to count the number of solutions to (3.1). Together with a simple inclusion/exclusion argument, this shows that one of the $t_{1}$ points connected to our starting $x$ has a corresponding maximal rotation and hence $x$ is connected to $\mathcal{C}(p)$.
If $\left|H_{1}\right| \leq p^{1 / 2+\delta}$ then $\mathrm{RH}$ for these curves is of little use (their genus is too large), and we have to proceed using other methods. We assume that $\left|H_{1}\right| \geq\left|H_{2}\right|$ so that the trivial upper bound for the number of solutions to (3.1) is $2\left|H_{2}\right|$. What we need is a power saving in this upper bound in the case that $\left|H_{2}\right|$ is close to $\left|H_{1}\right|$, that is, a bound of the form $C_{\tau}\left|H_{1}\right|^{\tau}$, with $\tau<1, C_{\tau}<\infty$ (both fixed).
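The trivial bound $2\left|H_{2}\right|$ comes from the fact that, for each fixed $y$, the first line of (3.1) is a quadratic equation in $x$ with at most two roots. The following sketch counts solutions by brute force for subgroups of $\mathbb{F}_{p}^{*}$ only (the $\mathbb{F}_{p^{2}}^{*}$ case is not covered). The parameters $p=101$, $b=2$ and the subgroup orders in the usage lines are arbitrary illustrative choices, and `pow(x, -1, p)` requires Python 3.8 or later.

```python
def subgroup(p, t):
    """Elements x of F_p^* with x^t = 1; this has exactly t elements when t divides p - 1."""
    return [x for x in range(1, p) if pow(x, t, p) == 1]


def count_solutions(p, b, H1, H2):
    """Number of pairs (x, y) in H1 x H2 with x + b/x = y + 1/y in F_p."""
    # Tabulate the values x + b/x over H1, then look up y + 1/y for each y in H2.
    values = {}
    for x in H1:
        key = (x + b * pow(x, -1, p)) % p
        values[key] = values.get(key, 0) + 1
    return sum(values.get((y + pow(y, -1, p)) % p, 0) for y in H2)


if __name__ == "__main__":
    p, b = 101, 2
    H1, H2 = subgroup(p, 20), subgroup(p, 10)
    n = count_solutions(p, b, H1, H2)
    print(len(H1), len(H2), n, 2 * len(H2))
```

Comparing the printed count with $2\left|H_{2}\right|$ for various subgroup sizes gives a feel for how much room there is below the trivial bound.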
In the prime modulus case, there are several ways to proceed. The first and second methods are related to "elementary" proofs of the Riemann Hypothesis for curves. One can use auxiliary polynomials as in Stepanov's proof [137] of the Riemann Hypothesis for curves to give the desired power saving with an explicit $\tau$ (cf. [76] which deals with $x+y=1$ and $\left|H_{1}\right|=\left|H_{2}\right|$ in (3.1)). The second method, giving the best upper bound, namely $20 \max \left\{\left(\left|H_{1}\right| \cdot\left|H_{2}\right|\right)^{1 / 3}, \frac{\left|H_{1}\right| \cdot\left|H_{2}\right|}{p}\right\}$, is due to Corvaja and Zannier [53]. It uses their method for estimating the greatest common divisor of $u-1$ and $v-1$ in terms of the degrees of $u$ and $v$ and their supports, as well as (hyper) Wronskians.
The third method is based on the Szemerédi-Trotter theorem for modular hyperbolas [11], whose proof crucially uses expansion and the $L^{2}$-flattening lemma in $\mathrm{SL}_{2}(\mathbb{Z} / p \mathbb{Z})$ [16].
Theorem 8. Let $\Phi: \mathbb{F}_{p} \rightarrow \operatorname{Mat}_{2}(\mathbb{F}_{p})$ be such that $\det \Phi$ does not vanish identically and $\operatorname{Im} \Phi \cap \mathrm{PGL}_{2}(\mathbb{F}_{p})$ is not contained in a set of the form $\mathbb{F}_{p}^{*} \cdot gH$ for some $g \in \mathrm{SL}_{2}(\mathbb{F}_{p})$ and $H$ a proper subgroup of $\mathrm{SL}_{2}(\mathbb{F}_{p})$. Then the following holds:
Given $\varepsilon>0$, $r>1$, there is $\delta>0$ such that if $A \subset P^{1}(\mathbb{F}_{p})$ and $L \subset \mathbb{F}_{p}$ satisfy
\begin{align*} 1 \ll|A| & <p^{1-\varepsilon} \tag{3.2}\\ \log |A| & <r \log |L| \tag{3.3} \end{align*}
then
\begin{equation*} \left|\left\{(x, y, t) \in A \times A \times L ;\ y=\tau_{\Phi(t)}(x)\right\}\right|<|A|^{1-\delta}|L| \tag{3.4} \end{equation*}
where for $g=\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ we write $\tau_{g}(x)=\frac{a x+b}{c x+d}$.
While producing poor exponents $\tau$, this method is robust and works in the generality in which superstrong approximation for $\mathrm{SL}_{2}(\mathbb{Z}/q\mathbb{Z})$ has been established; in particular, the analogue of Theorem 8 for $\mathbb{Z}/p^{n}\mathbb{Z}$, which follows from expansion in $\mathrm{SL}_{2}(\mathbb{Z}/p^{n}\mathbb{Z})$, established${ }^{13}$ in [16], plays a crucial role in the proof of Theorem 4 in [26].
The above leads to a proof of part 1 of Theorem 1. To continue, one needs to deal with $t_{1}$ which is very small (here $|H_{1}|=t_{1}$, which divides $p^{2}-1$).
To handle these, we lift to characteristic zero and examine the finite orbits of $\Gamma$ in $X(\overline{\mathbb{Q}})$. In fact, by the Chebotarev Density Theorem, a necessary condition for Conjecture 3 to hold is that there are no such orbits other than $\{0\}$. Again using the rotations in the conic sections by planes, one finds that any such finite orbit must be among the solutions, with the $t_{j}$ roots of unity, to
\begin{equation*} \left(t_{1}+t_{1}^{-1}\right)^{2}+\left(t_{2}+t_{2}^{-1}\right)^{2}+\left(t_{3}+t_{3}^{-1}\right)^{2}=\left(t_{1}+t_{1}^{-1}\right)\left(t_{2}+t_{2}^{-1}\right)\left(t_{3}+t_{3}^{-1}\right) \tag{3.5} \end{equation*}
For this particular surface $X$, one can show, using the inequality between the geometric and arithmetic means, that (3.5) has no nontrivial solutions in complex numbers with $|t_{j}|=1$. For the more general surfaces $X_{k}$, $S_{A,B,C,D}$, and those in (1.8), there is a variety of solutions with $|t_{j}|=1$. However, Lang's $\mathbb{G}_{m}$ Conjecture, which is established effectively (see [2, 92]), yields that there are only finitely many solutions to these equations in roots of unity. This allows for an explicit determination of the finite orbits of $\Gamma_{Y}$ in $Y(\overline{\mathbb{Q}})$ (as noted earlier for the cubic surfaces $S_{A,B,C,D}$, the long list of these orbits [96] corresponds to the algebraic Painlevé VI's). This $\overline{\mathbb{Q}}$ analysis leads to part 2 of Theorem 1 and, combined with the discussion above, it yields a proof of Conjecture 3, at least if $p^{2}-1$ is not very smooth. To prove Theorem 2, we need to show that there are very few primes for which the above arguments fail. This is done by extending the arguments and results in [43] and [44] concerning points $(x, y)$ on irreducible curves over $\mathbb{F}_{p}$ for which $\operatorname{ord}(x)+\operatorname{ord}(y)$ is small (here $\operatorname{ord}(x)$ is the order of $x$ in $\mathbb{F}_{p}^{*}$).
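The arithmetic-geometric mean step for (3.5) can be sketched as follows; this is a reconstruction under the stated hypotheses rather than a quotation from the cited works, and the abbreviation $x_{j}$ is introduced here only for convenience. For $|t_{j}|=1$ the numbers $x_{j}=t_{j}+t_{j}^{-1}=2 \operatorname{Re} t_{j}$ are real with $|x_{j}| \leq 2$, and (3.5) reads $x_{1}^{2}+x_{2}^{2}+x_{3}^{2}=x_{1} x_{2} x_{3}$. The inequality between the means then gives

$$
|x_{1} x_{2} x_{3}| = x_{1}^{2}+x_{2}^{2}+x_{3}^{2} \geq 3\left(x_{1}^{2} x_{2}^{2} x_{3}^{2}\right)^{1/3} = 3\,|x_{1} x_{2} x_{3}|^{2/3},
$$

so either $x_{1} x_{2} x_{3}=0$ or $|x_{1} x_{2} x_{3}| \geq 27$. The second alternative is incompatible with $|x_{1} x_{2} x_{3}| \leq 2^{3}=8$, so the product vanishes, hence so does the sum of squares, and $x_{1}=x_{2}=x_{3}=0$. In other words, the only solution on the unit circle corresponds to the trivial point $(0,0,0)$.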
The proof of Theorem 6 in the stronger form that almost all Markoff numbers are highly composite, that is, for every $v \geq 1$, as $T \rightarrow \infty$,
$$\sum_{\substack{m \in M^{s},\ m \leq T \\ m \text{ has at most } \\ v \text{ distinct prime factors}}} 1=o\left(\sum_{\substack{m \in M^{s} \\ m \leq T}} 1\right),$$
makes use of counting points on $X^{*}(\mathbb{Z})$ of height at most $T$ and, in particular, Mirzakhani's orbit equidistribution [111], as well as the transitivity properties of $\Gamma$ on $X^{*}(q)$ for $q$ a product of suitable primes $p$. The latter are provided by the results of Meiri and Puder [110]. For $p \equiv 1\ (4)$ for which the induced permutation action of $\Gamma$ on $X^{*}(p)$ is transitive, they show that the resulting permutation group is essentially the full symmetric or alternating group on $X^{*}(p)$. Applying Goursat's (disjointness) Lemma leads to the $\Gamma$-action on $X^{*}(p_{1} p_{2} \cdots p_{k})$ being transitive for any such primes $p_{1}<p_{2}<\cdots<p_{k}$.
FIGURE 3
Level set $\kappa=-2.1$.
FIGURE 4
Level set $\kappa=-1.9$.

4. REAL DYNAMICS ON SURFACES OF MARKOFF TYPE

4.1. In this section we discuss the work of Goldman [73] pertaining to the modular group action on real SL(2)-characters of a one-holed torus. The fundamental group $\pi$ of the one-holed torus is the free group of rank two. The mapping class group of the one-holed torus is isomorphic to the outer automorphism group $\operatorname{Out}(\pi) \cong \operatorname{GL}(2, \mathbb{Z})$ of $\pi$ and acts on the moduli space of equivalence classes of $\operatorname{SL}(2, \mathbb{C})$-representations of $\pi$; this moduli space identifies naturally with affine 3-space $\mathbb{C}^{3}$, using the traces of two generators of $\pi$ and of their product as coordinates. In these coordinates, the trace of the commutator of the two generators (representing the boundary curve of the torus) is given by $\kappa(x, y, z)=x^{2}+y^{2}+z^{2}-x y z-2$, which is preserved under the action of $\operatorname{Out}(\pi)$, and the action of $\operatorname{Out}(\pi)$ on $\mathbb{C}^{3}$ is commensurable with the action of the group $\Gamma$ of polynomial automorphisms of $\mathbb{C}^{3}$ which preserve $\kappa$. Figures 3-8 show level sets of $\kappa$.
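As a sanity check on the formula for $\kappa$, the following minimal Python sketch verifies, on sample data only, that $\kappa$ evaluated at the traces of two $\mathrm{SL}(2)$ matrices and of their product equals the trace of their commutator, and that $\kappa$ is unchanged by the substitution $z \mapsto xy - z$. The helper names (`kappa`, `commutator_trace`, `vieta_z`) are illustrative and not taken from [73].

```python
from fractions import Fraction
from itertools import product


def kappa(x, y, z):
    """kappa(x, y, z) = x^2 + y^2 + z^2 - xyz - 2."""
    return x * x + y * y + z * z - x * y * z - 2


def mul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def inv(a):
    """Inverse of a 2x2 matrix of determinant 1."""
    return ((a[1][1], -a[0][1]), (-a[1][0], a[0][0]))


def tr(a):
    """Trace of a 2x2 matrix."""
    return a[0][0] + a[1][1]


def commutator_trace(a, b):
    """Trace of a b a^{-1} b^{-1} for a, b of determinant 1."""
    return tr(mul(mul(a, b), mul(inv(a), inv(b))))


def vieta_z(x, y, z):
    """Replace z by the other root of the quadratic kappa(x, y, .) = const."""
    return (x, y, x * y - z)


def sample_points():
    """A small grid of rational test points (an arbitrary choice)."""
    return [tuple(Fraction(c, 3) for c in p) for p in product(range(-3, 4), repeat=3)]
```

For example, with `a = ((1, 1), (0, 1))` and `b = ((1, 0), (1, 1))`, one can compare `commutator_trace(a, b)` with `kappa(tr(a), tr(b), tr(mul(a, b)))`. Exact rational arithmetic is used so that the equalities are checked without rounding.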

FIGURE 5
Level set $\kappa=1.9$.
FIGURE 6
Level set $\kappa=2.1$.
In [73] Goldman studied the dynamics of the $\Gamma$-action on the set of real points of this moduli space, and more precisely on the level sets $\kappa^{-1}(t) \cap \mathbb{R}^{3}$, for $t \in \mathbb{R}$. The action of $\Gamma$ preserves a Poisson structure defining a $\Gamma$-invariant area form on each level set. It is shown that for $t<2$ the $\Gamma$-action is properly discontinuous on the four contractible components of each level set and ergodic on the compact component (which is empty if $t<-2$); the contractible components correspond to Teichmüller spaces of complete hyperbolic structures on a one-holed torus if $t \leq-2$, and of a torus with a single cone point singularity if $-2<t<2$. For $t=2$, the level set consists of characters of reducible representations and comprises two ergodic components; for $2<t \leq 18$ the action of $\Gamma$ on a level set is ergodic; and for $t>18$ the moduli space contains characters of discrete representations uniformizing a three-holed sphere and the action is ergodic on the complement.
FIGURE 7
Level set $\kappa=2.1$.

FIGURE 8
Level set $\kappa=4$.

4.2. The main objective of [38] is the dynamical description of elements of the mapping class group of the four-punctured sphere acting on two-dimensional slices of its character variety. It also contains three striking applications of this analysis: to the dynamics of the mapping class group on the character variety, to the spectrum of certain discrete Schrödinger equations, and to the sixth Painlevé equation. Cantat considers the space of representations of the free group given by the presentation $F_{3}=\langle\alpha, \beta, \gamma, \delta \mid \alpha \beta \gamma \delta=1\rangle$ into $\mathrm{SL}(2, \mathbb{C})$ modulo conjugacy. By fixing the trace of the images of the four generators, one obtains a space that is naturally parameterized by a cubic surface $S_{A,B,C,D} \subset \mathbb{C}^{3}$ given by $x^{2}+y^{2}+z^{2}+x y z=A x+B y+C z+D$ for some parameters $A, B, C, D \in \mathbb{C}$. This surface admits three natural involutions $s_{x}, s_{y}, s_{z}$, each of which fixes two of the three coordinates and sends the remaining one to the other root of the corresponding quadratic. These involutions generate a group $\Gamma$ of affine automorphisms. Automorphisms of $F_{3}$ act by composition on the space of representations, preserving the trace, and the group of outer automorphisms of $F_{3}$ acts on $S_{A,B,C,D}$ in such a way that its image contains $\Gamma$ as a finite index subgroup.
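The three involutions can be written down explicitly by solving the defining equation as a quadratic in one coordinate. The Python sketch below does this under the assumption that the surface is given exactly by $x^{2}+y^{2}+z^{2}+xyz=Ax+By+Cz+D$ as above; the function names (`defect`, `s_x`, `s_y`, `s_z`) are introduced here for illustration. Since the sum of the two roots in $x$ is $A-yz$, the involution $s_{x}$ sends $x$ to $A-yz-x$, and similarly for the other coordinates.

```python
from fractions import Fraction


def defect(p, params):
    """Left side minus right side of the surface equation; zero on S_{A,B,C,D}."""
    x, y, z = p
    A, B, C, D = params
    return x * x + y * y + z * z + x * y * z - (A * x + B * y + C * z + D)


def s_x(p, params):
    """Fix y, z and send x to the other root of the quadratic in x."""
    x, y, z = p
    return (params[0] - y * z - x, y, z)


def s_y(p, params):
    """Fix x, z and send y to the other root of the quadratic in y."""
    x, y, z = p
    return (x, params[1] - x * z - y, z)


def s_z(p, params):
    """Fix x, y and send z to the other root of the quadratic in z."""
    x, y, z = p
    return (x, y, params[2] - x * y - z)


def point_on_surface(p, A, B, C):
    """Choose D so that the given point p lies on S_{A,B,C,D}; returns the params."""
    D = defect(p, (A, B, C, 0))
    return (A, B, C, D)
```

Each map in fact preserves the value of `defect` at every point of $\mathbb{C}^{3}$, not only on the surface, which is why composing them in any order keeps a starting point on $S_{A,B,C,D}$.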
FIGURE 9
$S_{(-0.2,-0.2,-0.2,4.39)}$.

FIGURE 10
Projection of the stable manifold.

FIGURE 11
$S_{(0,0,0,3)}$.

FIGURE 12
Projection of the intersection of the stable manifold with the upper part of the surface.

FIGURE 13
$S_{(0,0,0,4.1)}$.

FIGURE 14
Projection of the intersection of the stable manifold with the upper part of the surface.

An element $f \in \Gamma$ is called hyperbolic if it corresponds to a pseudo-Anosov automorphism in the mapping class group, or, equivalently, if it is not conjugate to the product of one or two involutions as above. Take $f \in \Gamma$ of hyperbolic type, and compactify $S_{A,B,C,D}$ by taking its closure in $\mathbb{P}^{3}$. The divisor at infinity is a cycle of three rational curves. By conjugating $f$ in a suitable way, one can make it algebraically stable in the sense of Fornaess and Sibony [61], so that it contracts all curves at infinity to a single superattracting fixed point. One can then prove that at that point the map is locally conjugate to a monomial map whose spectral radius $\lambda(f)$ is greater than 1. This enables one to define the Green functions $G_{f}^{\pm}=\lim_{n} \lambda(f)^{-n} \log^{+}\left|f^{\pm n}\right|$, and show that they are plurisubharmonic, continuous, and possess natural invariance properties. It follows that the positive measure $\mu_{f}=d d^{c} G_{f}^{+} \wedge d d^{c} G_{f}^{-}$ is well defined and $f$-invariant. Moreover, $\mu_{f}$ turns out to be mixing and to be the unique measure of maximal entropy, the entropy being $\log \lambda(f)$. All these properties are reminiscent of the dynamics of Hénon mappings in the complex plane, and are proved analogously. Next, assume all coefficients $A, B, C, D$ are real, and suppose the real part $S_{A,B,C,D}(\mathbb{R})$ is connected (in which case it is homeomorphic to the sphere minus four points).
The main theorem of the paper states that the support of the measure $\mu_{f}$ is then included in $S_{A,B,C,D}(\mathbb{R})$ and that the induced map on $S_{A,B,C,D}(\mathbb{R})$ is uniformly hyperbolic on its nonwandering set. The proof of this striking theorem uses deep results by Bedford and Smillie [6] on the characterization of nonhyperbolic real Hénon maps having the same entropy as their complexification and relies on a delicate geometrical analysis of the possibilities for the intersection of stable and unstable manifolds in $S_{A,B,C,D}(\mathbb{R})$.

5. AN ASYMPTOTIC FORMULA FOR INTEGER POINTS ON MARKOFF-HURWITZ VARIETIES

For integer parameters $n \geq 3$, $a \geq 1$, and $k \in \mathbb{Z}$, consider the Diophantine equation
\begin{equation*} x_{1}^{2}+x_{2}^{2}+\cdots+x_{n}^{2}=a x_{1} x_{2} \cdots x_{n}+k \tag{5.1} \end{equation*}
We call this the generalized${ }^{14}$ Markoff-Hurwitz equation. In this section we count solutions to (5.1) in integers, which we call Markoff-Hurwitz tuples. More precisely, let $V$ be the affine subvariety of $\mathbb{C}^{n}$ cut out by (5.1). In a joint work with Magee and Ronan [68], we investigated the asymptotic size of the set $V(\mathbb{Z}) \cap B(R)$, where $B(R)$ is the ball of radius $R$ in the $\ell^{\infty}$-norm on $\mathbb{R}^{n} \subset \mathbb{C}^{n}$. Perhaps somewhat surprisingly, the asymptotic growth for $n \geq 4$ is not of the order $(\log R)^{n-1}$, as was first noticed by Baragar [4], who subsequently in [5] proved that there is a number $\beta=\beta(n)$ such that when $k=0$, if $V(\mathbb{Z})-\{(0,0, \ldots, 0)\}$ is nonempty then
\begin{equation*} |V(\mathbb{Z}) \cap B(R)|=(\log R)^{\beta+o(1)} \tag{5.2} \end{equation*}
as $R \rightarrow \infty$.
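To make the counting problem concrete, here is a small Python sketch that enumerates part of $V(\mathbb{Z}) \cap B(R)$. It is not the method of [5] or [68]: it only explores the positive, sorted tuples reachable from one chosen solution by the classical move that replaces $x_{j}$ with $a \prod_{i \neq j} x_{i}-x_{j}$, which preserves (5.1). The function names and the choice of starting tuples are assumptions made for illustration.

```python
from math import prod


def mh_value(x, a):
    """x_1^2 + ... + x_n^2 - a * x_1 * ... * x_n; equals k on a solution of (5.1)."""
    return sum(t * t for t in x) - a * prod(x)


def vieta_move(x, j, a):
    """Replace x[j] by the other root of (5.1) viewed as a quadratic in x[j]."""
    rest = x[:j] + x[j + 1:]
    new = a * prod(rest) - x[j]
    return tuple(sorted(rest + (new,)))


def orbit_in_box(root, a, R):
    """Sorted positive tuples reachable from root with all entries at most R."""
    root = tuple(sorted(root))
    seen = {root}
    stack = [root]
    while stack:
        x = stack.pop()
        for j in range(len(x)):
            y = vieta_move(x, j, a)
            if y not in seen and y[0] > 0 and y[-1] <= R:
                seen.add(y)
                stack.append(y)
    return seen
```

For instance, `len(orbit_in_box((1, 1, 1), 3, R))` counts sorted Markoff triples of height at most `R` (the case $n=3$, $a=3$, $k=0$), and `orbit_in_box((2, 2, 2, 2), 1, R)` does the same for one orbit with $n=4$, $a=1$, $k=0$. Comparing such counts with powers of $\log R$ gives a rough numerical feel for (5.2), although the convergence is slow.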
In [5] the following bounds for the exponents $\beta(n)$ were also obtained:
\begin{align*} & \beta(3)=2 \\ & \beta(4) \in(2.430,2.477) \tag{5.3}\\ & \beta(5) \in(2.730,2.798) \\ & \beta(6) \in(2.963,3.048) \end{align*}
and, in general,
$$\frac{\log (n-1)}{\log 2}<\beta(n)<\frac{\log (n-1)}{\log 2}+o\left(n^{-0.58}\right).$$
The following problems were posed by Silverman in 1995 [136] (in the setting of $k=0$):
Problem 1. Is there a true asymptotic formula for $|V(\mathbb{Z}) \cap B(R)|$ with main term proportional to $(\log R)^{\beta}$?
Problem 2. Is $\beta(n)$ irrational?
In [68] a complete answer to Problem 1 was obtained by extending Baragar's exponential rate of growth estimate to a true asymptotic formula.${ }^{15}$
When $k>0$, there are certain exceptional families of solutions to (5.1) that have a different quality of growth and, for fixed $k, a, n$, we write $\mathcal{E}$ for the set of exceptional tuples. We obtain the following theorem for the asymptotic number of Markoff-Hurwitz tuples:
Theorem 9. For each $(n, a, k)$ with $V(\mathbb{Z})-\mathcal{E}$ infinite, there is a positive constant $c=c(n, a, k)$ such that
$$|(V(\mathbb{Z})-\mathcal{E}) \cap B(R)|=c(\log R)^{\beta}+o\left((\log R)^{\beta}\right).$$
Here $\beta$ is the same constant as in (5.2).
After renormalizing (5.1), which allows us to set $a=1$, and rearranging entries, Markoff-Hurwitz transformations induce the moves
\begin{equation*} \lambda_{j}\left(z_{1}, \ldots, z_{n}\right)=\left(z_{1}, \ldots, \widehat{z_{j}}, \ldots, z_{n},\left(\prod_{i \neq j} z_{i}\right)-z_{j}\right), \quad 1 \leq j \leq n-1 \tag{5.4} \end{equation*}
on ordered tuples of real numbers. Above, $\widehat{\bullet}$ denotes omission. If sufficiently many of the $z_{i}$ are large, the move $\lambda_{j}$ can be approximated by
$$z \mapsto\left(z_{1}, \ldots, \widehat{z_{j}}, \ldots, z_{n}, \prod_{i \neq j} z_{i}\right)$$
to high accuracy relative to the largest entries of $z$. When the $z_{i}$ are positive, at the level of logarithms this corresponds to
$$\left(\log z_{1}, \log z_{2}, \ldots, \log z_{n}\right) \mapsto\left(\log z_{1}, \ldots, \widehat{\log z_{j}}, \ldots, \log z_{n}, \sum_{i \neq j} \log z_{i}\right).$$
Thus one is naturally led to study the linear semigroup generated by the linear maps
\begin{equation*} \gamma_{j}\left(y_{1}, y_{2}, \ldots, y_{n}\right)=\left(y_{1}, \ldots, \widehat{y_{j}}, \ldots, y_{n}, \sum_{i \neq j} y_{i}\right) \tag{5.5} \end{equation*}
on ordered $n$-tuples $\left(y_{1}, \ldots, y_{n}\right)$.
Let
$$\Gamma=\left\langle\gamma_{1}, \ldots, \gamma_{n-1}\right\rangle_{+},$$
where we have written a "+" to indicate that we are generating a semigroup, not a group.
An important idea in [68] that explains why we are able to make progress on the counting problem is that we replace the generators of $\Gamma$ with the countably infinite generating set $T_{\Gamma}=\left\{\gamma_{n-1}^{A} \gamma_{j}: A \in \mathbb{Z}_{\geq 0},\ 1 \leq j \leq n-2\right\}$ and then consider the semigroup $\Gamma^{\prime}=\left\langle T_{\Gamma}\right\rangle_{+}$.
Both $\Gamma$ and $\Gamma^{\prime}$ preserve the nonnegative ordered hyperplane
\begin{equation*} \mathscr{H} \equiv\left\{\left(y_{1}, \ldots, y_{n}\right) \in \mathbb{R}_{\geq 0}^{n}: y_{1} \leq y_{2} \leq \cdots \leq y_{n},\ \sum_{j=1}^{n-1} y_{j}=y_{n}\right\} \subset \mathbb{R}_{\geq 0}^{n} \tag{5.6} \end{equation*}
Moreover, any element of $\Gamma$ maps ordered tuples in $\mathbb{R}_{\geq 0}^{n}$ into $\mathscr{H}$. Therefore the study of orbits of $\Gamma$ and $\Gamma^{\prime}$ on ordered tuples boils down to the study of orbits in $\mathscr{H}$. We can use the basis
$$e_{j}=(0, \ldots, 0, \underbrace{1}_{j}, 0, \ldots, 0,1), \quad 1 \leq j \leq n-1,$$
for the subspace spanned by $\mathscr{H}$. This basis clarifies the action of $\Gamma^{\prime}$.
When $n=3$, the linear map $\sigma: \mathscr{H} \rightarrow \mathscr{H}$ defined by
\begin{equation*} \sigma(a, b, a+b)=\operatorname{order}(b-a, a, b) \tag{5.7} \end{equation*}
where order puts a tuple in ascending order from left to right, is such that for $j=1,2$ we have $\sigma \gamma_{j} . y=y$ for all $y \in \mathscr{H}$. Repeatedly applying the map $\sigma$ to a triple $(a, b, a+b)$ with $a \leq b \in \mathbb{Z}$ performs the Euclidean algorithm on $a, b$. However, one application of $\sigma$ corresponds in general to less than one step of the algorithm. Replacing $\Gamma$ with $\Gamma^{\prime}$ corresponds to speeding this up so that one whole step of the Euclidean algorithm corresponds to one semigroup generator. As for counting, the orbit of $(0,1,1)$ under $\Gamma$ consists precisely of those $(a, b, a+b)$ with $(a, b)=1$ and thus can be counted by elementary methods.
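The relation between $\sigma$ and the Euclidean algorithm can be checked directly. The following Python sketch, with helper names chosen here for illustration, implements $\sigma$ from (5.7) and the maps $\gamma_{j}$ from (5.5) for $n=3$, and iterates $\sigma$ until the first entry vanishes, at which point the middle entry is the greatest common divisor.

```python
def gamma(j, y):
    """The map gamma_j of (5.5): drop the j-th entry (1-based), append the sum of the rest."""
    rest = y[:j - 1] + y[j:]
    return rest + (sum(rest),)


def sigma(y):
    """The map sigma of (5.7) on a triple (a, b, a + b) with 0 <= a <= b."""
    a, b, _ = y
    return tuple(sorted((b - a, a, b)))


def euclid_via_sigma(a, b):
    """Iterate sigma from (a, b, a + b); return (gcd, number of applications)."""
    lo, hi = sorted((a, b))
    y = (lo, hi, lo + hi)
    steps = 0
    while y[0] != 0:
        y = sigma(y)
        steps += 1
    return y[1], steps
```

Starting from $(4,6,10)$ the iteration passes through $(2,4,6)$ and $(2,2,4)$ before reaching $(0,2,2)$. Each application performs a single subtraction, which is the sense in which one application of $\sigma$ is less than one division step of the Euclidean algorithm.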
When $n=3$, the semigroup $\Gamma^{\prime}$ is generated by
$$g_{A}:=\gamma_{2}^{A} \gamma_{1}=\begin{pmatrix} 0 & 1 \\ 1 & A+1 \end{pmatrix}$$
with respect to the basis $\left\{e_{1}, e_{2}\right\}$. These generators are classically connected with continued fractions by the formulae
$$\begin{pmatrix} 0 & 1 \\ 1 & A_{1} \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & A_{2} \end{pmatrix} \cdots\begin{pmatrix} 0 & 1 \\ 1 & A_{k} \end{pmatrix}=\begin{pmatrix} \star & b \\ \star & d \end{pmatrix}, \quad \frac{b}{d}=\cfrac{1}{A_{1}+\cfrac{1}{A_{2}+\cfrac{1}{\ddots+\cfrac{1}{A_{k}}}}}.$$
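The continued fraction identity is easy to test numerically. The sketch below, written for illustration with function names of our own choosing, multiplies the matrices $\begin{pmatrix} 0 & 1 \\ 1 & A_{i} \end{pmatrix}$ from left to right and compares the ratio of the entries in the second column with the continued fraction evaluated from the inside out. It also checks the stated form of $g_{A}$, using the $2 \times 2$ matrices of $\gamma_{1}$ and $\gamma_{2}$ in the basis $\{e_{1}, e_{2}\}$, which are computed here from (5.5) and are not quoted from the source.

```python
from fractions import Fraction

IDENTITY = ((1, 0), (0, 1))
GAMMA_1 = ((0, 1), (1, 1))  # gamma_1 in the basis {e_1, e_2}, derived from (5.5)
GAMMA_2 = ((1, 0), (1, 1))  # gamma_2 in the basis {e_1, e_2}, derived from (5.5)


def matmul2(m, n):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def cf_matrix(partial_quotients):
    """Product of the matrices [[0, 1], [1, A_i]] for A_1, ..., A_k in order."""
    m = IDENTITY
    for A in partial_quotients:
        m = matmul2(m, ((0, 1), (1, A)))
    return m


def cf_value(partial_quotients):
    """The continued fraction 1/(A_1 + 1/(A_2 + ... + 1/A_k)), with all A_i >= 1."""
    v = Fraction(0)
    for A in reversed(partial_quotients):
        v = 1 / (A + v)
    return v


def g(A):
    """The generator gamma_2^A gamma_1 as a 2x2 matrix."""
    m = GAMMA_1
    for _ in range(A):
        m = matmul2(GAMMA_2, m)
    return m
```

For the partial quotients $(1, 2, 3)$ the product matrix has second column $(7, 10)$, matching $1/(1+1/(2+1/3)) = 7/10$.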
FIGURE 15
When $n=4$, the semigroup elements map $\Delta=\mathscr{H} / \mathbb{R}_{+}$ into a strictly smaller subset. After iteration, this leads to more and more empty space (see also Figure 16). This does not occur when $n=3$, as one can also see from the picture: the action of the group elements $\gamma_{2}$ and $\gamma_{3}$ on the vertical coordinate axis is a copy of the $n=3$ dynamics.
When $n=4$, the semigroup $\Gamma$ acts in the basis given by the $e_{i}$ as
$$\gamma_{1}=\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix}, \quad \gamma_{2}=\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix}, \quad \gamma_{3}=\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 1 \end{pmatrix}.$$
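These matrices can be recovered mechanically from the definition (5.5) of $\gamma_{j}$ and the basis $e_{j}$. The Python sketch below, with illustrative function names, applies $\gamma_{j}$ to each basis vector and reads off coordinates, using the fact that a vector $\sum_{i} c_{i} e_{i}$ has $c_{1}, \ldots, c_{n-1}$ as its first $n-1$ entries.

```python
def gamma(j, y):
    """The map gamma_j of (5.5): drop the j-th entry (1-based), append the sum of the rest."""
    rest = y[:j - 1] + y[j:]
    return rest + (sum(rest),)


def basis_vector(i, n):
    """e_i: a 1 in position i (1-based), a 1 in the last position, zeros elsewhere."""
    e = [0] * n
    e[i - 1] = 1
    e[n - 1] = 1
    return tuple(e)


def matrix_in_basis(j, n):
    """Matrix of gamma_j on the span of e_1, ..., e_{n-1}, as a tuple of rows."""
    # Column i holds the coordinates of gamma_j(e_i), i.e. its first n - 1 entries.
    columns = [gamma(j, basis_vector(i, n))[: n - 1] for i in range(1, n)]
    return tuple(
        tuple(columns[c][r] for c in range(n - 1)) for r in range(n - 1)
    )
```

Calling `matrix_in_basis(j, 4)` for `j = 1, 2, 3` reproduces the three matrices displayed above, and `matrix_in_basis(j, 3)` gives the $2 \times 2$ matrices underlying $g_{A}$.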
This semigroup appears naturally in different areas of mathematics. In most situations in which this semigroup appears, as is the case in [68], the dynamics of the projective linear action of $\Gamma$ on $\mathbb{R}_{+}^{3} / \mathbb{R}_{+}$ becomes relevant. Up to the minor modification of possibly multiplying the generators on the left or right by permutation matrices, the iterated function system given by the projective linear action of $\Gamma$ on $\mathbb{R}_{+}^{3} / \mathbb{R}_{+}$ has a fractal attracting set that is known as the Rauzy gasket [95].
So the semigroups $\Gamma$ and $\Gamma^{\prime}$ are natural extensions of the Euclidean algorithm and continued fractions semigroup to higher dimensions. Writing $\Delta=\mathscr{H} / \mathbb{R}_{+}$, we can view $\Delta$ as a subset of $\mathbb{R}^{n-2}$. The key distinction that appears when $n \geq 4$ is that
$$\Delta \neq \bigcup_{j=1}^{n-1} \gamma_{j}(\Delta)$$
and so the induced dynamics on $\mathscr{H} / \mathbb{R}_{+}$ has "holes", as we illustrate in Figure 15.
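One way to see the holes concretely: if $u=\gamma_{j}(y)$ with $y \in \mathscr{H}$, then $u_{n-1}=y_{n}=\sum_{i<n} y_{i}$, which forces the dropped entry to be $y_{j}=u_{n-1}-(u_{1}+\cdots+u_{n-2})$. This is automatically nonnegative when $n=3$ but is a genuine constraint when $n \geq 4$. The Python sketch below turns this observation, which is our own elementary reformulation and not taken from [68], into a membership test on integer points of $\mathscr{H}$; all function names are illustrative.

```python
def in_H(u):
    """Membership in the ordered hyperplane (5.6) for a tuple of numbers."""
    u = tuple(u)
    ordered = all(u[i] <= u[i + 1] for i in range(len(u) - 1))
    return min(u) >= 0 and ordered and sum(u[:-1]) == u[-1]


def gamma(j, y):
    """The map gamma_j of (5.5): drop the j-th entry (1-based), append the sum of the rest."""
    rest = y[:j - 1] + y[j:]
    return rest + (sum(rest),)


def has_preimage(u):
    """True if u = gamma_j(y) for some y in H and some 1 <= j <= n - 1."""
    u = tuple(u)
    n = len(u)
    d = u[n - 2] - sum(u[: n - 2])  # the only possible value of the dropped entry
    if d < 0:
        return False
    y = tuple(sorted(u[: n - 2] + (d,))) + (u[n - 2],)
    j = y.index(d) + 1
    return in_H(y) and j <= n - 1 and gamma(j, y) == u
```

For $n=4$ the point $(1,1,1,3) \in \mathscr{H}$ has no preimage, whereas $(1,1,2,4)=\gamma_{1}(0,1,1,2)$ does; for $n=3$ every point of $\mathscr{H}$ passes the test.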
Structure of the proof and the difficulties that arise. Here we highlight some of the main difficulties that must be overcome during the proof of Theorem 9. It is illuminating to recall the methods used by Lalley${ }^{16}$ in [90], where the action of a Schottky subgroup $G$ of $\mathrm{SL}_{2}(\mathbb{R})$ on the hyperbolic upper half-plane $\mathbb{H}$ is considered. Lalley obtains that, for any $x \in \mathbb{H}$, the number $\mathcal{N}(x, r)$ of elements $\gamma$ of $G$ such that
$$d_{\mathbb{H}}(i, \gamma x)-d_{\mathbb{H}}(i, x) \leq r,$$
where $d_{\mathbb{H}}$ is hyperbolic distance, satisfies $\mathcal{N}(x, r) \approx C e^{\delta r}$, where $\delta=\delta(G)$ is the Hausdorff dimension of the limit set of $G$, and $C=C(G, x)>0$. Lalley's proof incorporates at various stages the following arguments:
Shell argument. By repeated application of a "renewal equation," the quantity $\mathcal{N}(x, r)$ is related to a sum of $\mathcal{N}(y, r^{\prime})$, where the sum is over $y$ on a shell of radius $\approx c r$ in a Cayley tree of $G$, and $r^{\prime}$ is a translate of $r$ that corrects for the passage between $x$ and $y$. The purpose of this shell argument is to ensure that the points $y$ lie close to $\partial \mathbb{H}$.
Passage to the boundary. Each of the resulting $\mathcal{N}(y, r^{\prime})$ is compared to an analogous quantity $\mathcal{N}^{*}(y^{*}, r^{\prime})$, where $y^{*}$ is a point in $\partial \mathbb{H}$ close to $y$. Because each $y$ is close to $\partial \mathbb{H}$, the errors incurred are acceptable.
Transfer operator techniques. Asymptotic formulas for $\mathcal{N}^{*}(y^{*}, r^{\prime})$ are obtained using the renewal method and spectral estimates for transfer operators. This gives asymptotic formulas for $\mathcal{N}(y, r^{\prime})$. The main terms of the asymptotic formulas satisfy recursive relationships between different $y$.
Recombination. One finally has to recombine all the asymptotic formulas obtained for $\mathcal{N}(y, r^{\prime})$ to obtain an asymptotic formula for $\mathcal{N}(x, r)$. This is done using the recursive formulas obtained in the previous step.
Trying to follow the method outlined above for this orbital counting problem, we first need a suitable replacement for $\partial \mathbb{H}$. Our idea is to use the projectivization of the hyperplane $\mathscr{H}$; we call this set $\Delta$. We compare points in the orbit of $\Lambda$ (generated by the $\lambda_{j}$ in (5.4)) to points in $\Delta$ by taking logarithms of all coordinates and then projectivizing. This process does not necessarily lead to a point in $\Delta$; there is an important parameter $\alpha(z)=\prod_{j=1}^{n-2} z_{j}$ that appears throughout the paper and measures how good the fit is. If $\alpha(z)$ is large, then one can, in analogy with Lalley's setting, think of $z$ as being "close to the boundary."
For Lalley, the word length of $\gamma$ is roughly proportional to the quantity $d_{\mathbb{H}}(i, \gamma x)-d_{\mathbb{H}}(i, x)$ with respect to which he counts. This implies, during the shell argument, that all the elements of the shell are roughly the same distance from $\partial \mathbb{H}$. However, for us, there are arbitrarily long words in the generators of $\Lambda$ for which $\alpha(z)$ is small. We solve this problem using "acceleration," by replacing $\Lambda$ by $\Lambda^{\prime}$, and instead aim to follow Lalley's argument for orbits of $\Lambda^{\prime}$. This has the immediate benefit that we can guarantee that elements $z$ of shells of radius $L$, with respect to $\Lambda^{\prime}$, have large $\alpha(z)$, if we make $L$ appropriately large.
However, the acceleration also has some costs to be paid. The first issue arising is that now $\Lambda^{\prime}$ has countably many generators, so shells for word length on $\Lambda^{\prime}$ are not finite. Instead of using shells, we use intersections of shells with the elements of the $\Lambda^{\prime}$-orbit whose coordinates are not too large. The second issue is that the original $\Lambda$-orbit breaks up into countably many $\Lambda^{\prime}$-orbits. So we not only have to perform the recombination argument for $\Lambda^{\prime}$, but then have to perform an extra summation over the countably many $\Lambda^{\prime}$-orbits.
FIGURE 16
In the same setting $(n=4)$ of Figure 15, we show in black the images of $\Delta$ under the action of all words of length 10 in the generators $\{\gamma_{1}, \gamma_{2}, \gamma_{3}\}$.
After setting up our shell argument appropriately, we must perform the passage to the boundary (i.e., $\Delta$). To this end, we compare orbits of $\Lambda^{\prime}$ to orbits of $\Gamma^{\prime}$, where $\Gamma^{\prime}$ is the linear semigroup. To get this to work, we must exploit the following "shadowing" feature of the map $\log$ that takes logarithms of all entries of a vector. It says (roughly) that if $\log (z)$ is within $\epsilon$ of $y \in \mathscr{H}$, with $\epsilon$ on the scale of $\alpha(z)^{-2}$, then for all $\lambda \in \Lambda^{\prime}$, $\log (\lambda(z))$ is within $\epsilon$ of $\gamma(\log (z))$, where $\gamma \in \Gamma^{\prime}$ is matched with $\lambda$ in a natural way.
The completion of the proof relies on spectral estimates for transfer operators associated to the projective linear action of $\Gamma^{\prime}$ on $\Delta$. There are three key issues arising here. First, to obtain the spectral estimates we need, we must establish that the action of $\Gamma^{\prime}$ on $\Delta$ is uniformly contracting; it is important to note that this argument would not work if the acceleration had not been performed previously. Secondly, we need to establish that the relevant "log-Jacobian" cocycle over the dynamical system is not cohomologous to a lattice cocycle. Finally, but importantly, we must obtain spectral estimates for transfer operators acting on $C^{1}(\Delta)$, which is accomplished by adapting Liverani's approach to spectral estimates from [97]. See Section 4 of [68], and references therein, for the discussion of the Gauss map and Gauss measure [70,89] in this context.
The question of whether $\beta$ is irrational (Problem 2) remains a tantalizing open question, and one may wonder whether it is even algebraic. Our methods do give some partial insight into the nature of this mysterious number in terms of the action of $\Gamma^{\prime}$ on $\mathscr{H} / \mathbb{R}_{+}$.
Theorem 10. The number $\beta$ is the unique parameter in $(1, \infty)$ such that there exists a probability measure $\nu_{\beta}$ on $\Delta=\mathscr{H} / \mathbb{R}_{+}$ with the property
$$\int_{w \in \Delta} f(w) \, d \nu_{\beta}(w)=\sum_{\gamma \in T_{\Gamma}} \int_{w \in \Delta} f(\gamma \cdot w)\left|\operatorname{Jac}_{w}(\gamma)\right|^{\frac{\beta}{n-1}} d \nu_{\beta}(w)$$
for all $f \in C^{0}(\Delta)$. We call $\nu_{\beta}$ a conformal measure.
Theorem 10 can be viewed as a partial analog of the connection between the exponent of growth of a finitely generated Fuchsian group and the Hausdorff dimension of its limit set as a result of Patterson-Sullivan theory [117,138,139]. In our setting, the lack of any symmetric space means the parameter $\beta$ is not in any obvious way connected to the Hausdorff dimension of the compact $\Gamma^{\prime}$-invariant subset of $\Delta$.
The issue of the existence of a single integral solution for general $a$ and $k$ is very subtle, even for $n=3$, as discussed in the next section.

6. HASSE PRINCIPLE ON SURFACES OF MARKOFF TYPE

Little is known about the values at integers assumed by affine cubic forms${}^{17}$ $F$ in three variables. For $k \neq 0$, set
\begin{equation*} V_{k, F}=\left\{\mathbf{x}=\left(x_{1}, x_{2}, x_{3}\right): F(\mathbf{x})=k\right\}. \tag{6.1} \end{equation*}
The basic question is: for which $k$ is $V_{k, F}(\mathbb{Z}) \neq \emptyset$, or, more generally, infinite or Zariski dense in $V_{k, F}$?
A prime example is $F=S$, the sum of three cubes,
\begin{equation*} S\left(x_{1}, x_{2}, x_{3}\right)=x_{1}^{3}+x_{2}^{3}+x_{3}^{3}. \tag{6.2} \end{equation*}
There are obvious local congruence obstructions, namely that $V_{k, S}(\mathbb{Z})=\emptyset$ if $k \equiv 4,5 \pmod 9$, but beyond that, it is possible that the answers to all three questions are yes for all the other $k$'s, which we call the admissible values (see [50,113]). It is known that strong approximation in its strongest form fails for $V_{k, S}(\mathbb{Z})$, the global obstruction coming from an application of cubic reciprocity [41,49,75]. Moreover, the authors of [94] and [7] show that $V_{1, S}(\mathbb{Z})$ is Zariski dense in $V_{1, S}$.
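The congruence obstruction for the sum of three cubes can be checked directly. The following is a minimal Python sketch (the function name `residues_of_three_cubes` is ours and purely illustrative): it lists the residues modulo 9 attained by $x_1^3+x_2^3+x_3^3$ and confirms that 4 and 5 are exactly the residues that are missed.

```python
from itertools import product


def residues_of_three_cubes(modulus=9):
    """Residues mod `modulus` attained by x1^3 + x2^3 + x3^3."""
    # Cubes of a full residue system; for modulus 9 this is {0, 1, 8}.
    cubes = {pow(x, 3, modulus) for x in range(modulus)}
    return {(a + b + c) % modulus for a, b, c in product(cubes, repeat=3)}


attained = residues_of_three_cubes()
missed = sorted(set(range(9)) - attained)
print(missed)  # [4, 5]
```

This only verifies the local obstruction at 9; it says nothing about whether an admissible $k$ is actually represented.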
In [71] Ghosh and Sarnak investigate the Markoff form $F=M$,
\begin{equation*} M(\mathbf{x})=x_{1}^{2}+x_{2}^{2}+x_{3}^{2}-x_{1} x_{2} x_{3}. \tag{6.3} \end{equation*}
${}^{17}$ By an affine form $f$ in $n$ variables we mean $f \in \mathbb{Z}\left[x_{1}, \ldots, x_{n}\right]$ whose leading homogeneous term $f_{0}$ is nondegenerate and such that $f-k$ is (absolutely) irreducible for all constants $k$.

FIGURE 17
Lattice points and fundamental set for $k=3685$.

FIGURE 18
Closeup of fundamental set for $k=3685$.
Except for the case of the Cayley cubic with $k=4$, $V_{k, M}(\mathbb{Z})$ decomposes into a finite number $\mathfrak{h}_{M}(k)$ of $\Gamma$-orbits. For example, if $k=0$, then $\mathfrak{h}_{M}(k)=2$, corresponding to the orbits of $(0,0,0)$ and $(3,3,3)$. In order to study $\mathfrak{h}_{M}(k)$ both theoretically and numerically, Ghosh and Sarnak give an explicit reduction (descent) for the action of $\Gamma$ on $V_{k, M}(\mathbb{Z})$. For this purpose, it is convenient to remove an explicit set of special admissible $k$'s, namely those for which there is a point in $V_{k, M}(\mathbb{Z})$ with $\left|x_{j}\right|=0,1$, or $2$. These $k$'s take the form (i) $k=u^{2}+v^{2}$, (ii) $4(k-1)=u^{2}+3 v^{2}$, or (iii) $k=4+u^{2}$. The number of these special $k$'s (referred to as exceptional) with $0 \leq k \leq K$ is asymptotic to $C^{\prime} \frac{K}{\sqrt{\log K}}$. The remaining admissible $k$'s are called generic (all negative admissible $k$'s are generic). For them Ghosh and Sarnak give the following elegant reduced forms:
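The three shapes (i)-(iii) of exceptional $k$ can be tested mechanically. The sketch below is only an illustration under our own naming (`markoff_form`, `is_square`, and `exceptional_forms` are hypothetical helpers, not taken from [71]); it reports which of the three representations a given integer $k$ admits and does not check admissibility.

```python
from math import isqrt


def markoff_form(x1, x2, x3):
    """The Markoff form M of (6.3)."""
    return x1 * x1 + x2 * x2 + x3 * x3 - x1 * x2 * x3


def is_square(m):
    """True if m is a perfect square (negative numbers are not)."""
    return m >= 0 and isqrt(m) ** 2 == m


def exceptional_forms(k):
    """Labels among 'i', 'ii', 'iii' whose representation k admits."""
    forms = set()
    # (i) k = u^2 + v^2
    if k >= 0 and any(is_square(k - u * u) for u in range(isqrt(k) + 1)):
        forms.add("i")
    # (ii) 4(k - 1) = u^2 + 3 v^2
    m = 4 * (k - 1)
    if m >= 0 and any(is_square(m - 3 * v * v) for v in range(isqrt(m // 3) + 1)):
        forms.add("ii")
    # (iii) k = 4 + u^2
    if is_square(k - 4):
        forms.add("iii")
    return forms


print(sorted(exceptional_forms(5)))  # ['i', 'ii', 'iii']
print(sorted(exceptional_forms(7)))  # []
```

For $k=5$ the corresponding points with a small coordinate are, for instance, $(0,1,2)$, $(1,2,0)$, and $(2,1,0)$, each of which satisfies $M(\mathbf{x})=5$.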

FIGURE 19
Lattice points and fundamental set for $k=-3691$.

FIGURE 20
Closeup of fundamental set for $k=-3691$.
Proposition 11. (1) Let $k \geq 5$ be generic and consider the compact set
$$\mathfrak{F}_{k}^{+}=\left\{\mathbf{u} \in \mathbb{R}^{3}: 3 \leq u_{1} \leq u_{2} \leq u_{3}, \; u_{1}^{2}+u_{2}^{2}+u_{3}^{2}+u_{1} u_{2} u_{3}=k\right\}.$$
The points in $\mathfrak{F}_{k}^{+}(\mathbb{Z})=\mathfrak{F}_{k}^{+} \cap \mathbb{Z}^{3}$ are $\Gamma$-inequivalent, and any $\mathbf{x} \in V_{k, M}(\mathbb{Z})$ is $\Gamma$-equivalent to a unique point $\mathbf{u}^{\prime}=\left(-u_{1}, u_{2}, u_{3}\right)$ with $\mathbf{u}=\left(u_{1}, u_{2}, u_{3}\right) \in \mathfrak{F}_{k}^{+}(\mathbb{Z})$.
(2) Let $k<0$ be admissible and consider the compact set
$$\mathfrak{F}_{k}^{-}=\left\{\mathbf{u} \in \mathbb{R}^{3}: 3 \leq u_{1} \leq u_{2} \leq u_{3} \leq \tfrac{1}{2} u_{1} u_{2}, \; u_{1}^{2}+u_{2}^{2}+u_{3}^{2}-u_{1} u_{2} u_{3}=k\right\}.$$
The points in $\mathfrak{F}_{k}^{-}(\mathbb{Z})=\mathfrak{F}_{k}^{-} \cap \mathbb{Z}^{3}$ are $\Gamma$-inequivalent, and any $\mathbf{x} \in V_{k, M}(\mathbb{Z})$ is $\Gamma$-equivalent to a unique point $\mathbf{u}=\left(u_{1}, u_{2}, u_{3}\right) \in \mathfrak{F}_{k}^{-}(\mathbb{Z})$.
Some consequences of this are as follows. As $k \rightarrow \pm \infty$, we have
$$\mathfrak{h}_{M}(k) \ll_{\varepsilon}|k|^{\frac{1}{3}+\varepsilon}.$$
This follows from the fact that when considering the values taken by the corresponding indefinite quadratic form in the $y$ and $z$ variables, for each fixed $x$, the units are bounded in number due to the restrictions imposed by the fundamental sets.
Let $\mathfrak{h}_{M}^{\pm}(k)=\left|\mathfrak{F}_{k}^{\pm}(\mathbb{Z})\right|$, where $\pm=\operatorname{sgn}(k)$, this being defined for any $k$. Then for generic $k$, $\mathfrak{h}_{M}^{\pm}(k)=\mathfrak{h}_{M}(k)$, while otherwise $\mathfrak{h}_{M}(k) \leq \mathfrak{h}_{M}^{\pm}(k)$. We have
\begin{equation*} \sum_{\substack{k \neq 4 \\ |k| \leq K}} \mathfrak{h}_{M}^{\pm}(k) \sim C^{\pm} K(\log K)^{2}, \tag{6.4} \end{equation*}
where $C^{\pm}>0$ and $K \rightarrow \infty$.
So on average the numbers $\mathfrak{h}_{M}(k)$ are small. The explicit fundamental domains allow for numerical computations; these indicate that
\begin{equation*} \sum_{\substack{0<k \leq K \\ k \text{ admissible} \\ \mathfrak{h}_{M}(k)=0}} 1 \sim C_{0} K^{\theta} \tag{6.5} \end{equation*}
with $C_{0}>0$ and $\theta \approx 0.8875 \ldots$
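The kind of computation that the explicit fundamental sets make possible can be sketched as follows. This is a minimal illustration for the positive case only, assuming the description of $\mathfrak{F}_{k}^{+}$ in Proposition 11(1); the names `markoff_plus`, `fundamental_points_plus`, and `h_plus` are ours, and the sketch is not the code used for the numerics reported in [71]. It enumerates $\mathfrak{F}_{k}^{+}(\mathbb{Z})$ by brute force, so that `h_plus(k)` is the quantity $\mathfrak{h}_{M}^{+}(k)$ defined above, which agrees with $\mathfrak{h}_{M}(k)$ only when $k$ is generic.

```python
def markoff_plus(u1, u2, u3):
    """Left-hand side of the equation defining the set F_k^+."""
    return u1 * u1 + u2 * u2 + u3 * u3 + u1 * u2 * u3


def fundamental_points_plus(k):
    """Integer points with 3 <= u1 <= u2 <= u3 and markoff_plus(u) == k."""
    points = []
    u1 = 3
    # markoff_plus is increasing in each positive coordinate, so each loop
    # stops once the smallest allowed completion already exceeds k.
    while markoff_plus(u1, u1, u1) <= k:
        u2 = u1
        while markoff_plus(u1, u2, u2) <= k:
            u3 = u2
            while markoff_plus(u1, u2, u3) <= k:
                if markoff_plus(u1, u2, u3) == k:
                    points.append((u1, u2, u3))
                u3 += 1
            u2 += 1
        u1 += 1
    return points


def h_plus(k):
    """Number of integer points in F_k^+ (defined for any k)."""
    return len(fundamental_points_plus(k))


print(fundamental_points_plus(54))  # [(3, 3, 3)]
```

Since every coordinate is at most $\sqrt{k}$, the search is finite and cheap for the values of $k$ shown in the figures; the negative case requires the additional constraint $u_{3} \leq \frac{1}{2} u_{1} u_{2}$ and is not covered by this sketch.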
The main result in [71] concerns the values assumed by $M$ and the Hasse failures in (6.5):
Theorem 12. (i) There are infinitely many Hasse failures. More precisely, the number of $0<k \leq K$ and $-K \leq k<0$ for which the Hasse principle fails is at least $\sqrt{K}(\log K)^{-\frac{1}{4}}$ for $K$ large.
(ii) Fix $t \geq 0$. Then as $K \rightarrow \infty$,
$$\#\left\{|k| \leq K: k \text{ admissible, } \mathfrak{h}_{M}(k)=t\right\}=o(K).$$
Hasse failures are produced by an obstruction via quadratic reciprocity. They come in two types: one via direct use of reciprocity and the second also incorporating the descent group. Recently Colliot-Thélène, Wei, and Xu [48] and, independently, Loughran and Mitankin [98] have shown that the obstruction of the first type (but not of the second) can be explained in terms of the integral Brauer-Manin obstruction. For example, if $k=4+2 v^{2}$, with $v$ having all of its prime factors congruent to $\pm 1 \pmod 8$ and $v$ congruent to $0, \pm 3, \pm 4 \pmod 9$, then $k$ is admissible but $V_{k, M}(\mathbb{Z})=\emptyset$.
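To make the example family concrete, the following Python sketch lists the pairs $(v, k)$ with $k=4+2v^{2}$ for which $v$ satisfies the two congruence conditions exactly as stated above. The helper names (`prime_factors`, `in_family`, `family_levels`) are hypothetical, and the code only applies the stated conditions; it does not itself verify admissibility or the emptiness of $V_{k,M}(\mathbb{Z})$.

```python
def prime_factors(n):
    """Set of prime divisors of a positive integer n (trial division)."""
    factors = set()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def in_family(v):
    """Conditions of the text: primes dividing v are +-1 mod 8, v is 0, +-3, +-4 mod 9."""
    primes_ok = all(p % 8 in (1, 7) for p in prime_factors(v))
    residue_ok = v % 9 in (0, 3, 6, 4, 5)
    return primes_ok and residue_ok


def family_levels(v_max):
    """Pairs (v, k) with k = 4 + 2 v^2 for 1 <= v <= v_max in the family."""
    return [(v, 4 + 2 * v * v) for v in range(1, v_max + 1) if in_family(v)]


print(family_levels(50))
# [(23, 1062), (31, 1926), (41, 3366), (49, 4806)]
```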
Part (ii) of the theorem is proved by comparing the number of points on $V_{k, M}(\mathbb{Z})$ in certain tentacled regions, obtained from special plane sections, with the expected number of solutions according to a product of local densities; the crucial point is that the variance of this comparison goes to zero on averaging over $|k| \leq K$. As detailed in [71], this moving plane quadric method applies to more general cubic surfaces, including those that do not carry morphisms.

ACKNOWLEDGMENTS

I would like to thank Jean Bourgain, Emmanuel Breuillard, Michael Magee, Igor Pak, Ryan Ronan, and Peter Sarnak. My various collaborations with them form the core of this report.
Figure 1 is courtesy of Matthew de Courcy-Ireland.
Figure 2 is courtesy of Elena Fuchs.
Figures 3-8 are courtesy of William Goldman.
Figures 9-14 are courtesy of Serge Cantat.
Figures 17-20 are courtesy of Amit Ghosh.

FUNDING

The author was supported, in part, by NSF award DMS-1603715.

REFERENCES

[1] M. Aigner, Markov's Theorem and 100 years of the Uniqueness Conjecture. Springer, 2013.
[2] I. Aliev and C. Smyth, Solving Algebraic equations in roots of unity. Forum Math. 24 (2012), 641-645.
[3] A. Baragar, The Markoff equation and equations of Hurwitz. PhD Thesis, Brown University, RI, 1991.
[4] A. Baragar, Asymptotic growth of Markoff-Hurwitz numbers. Compos. Math. 94 (1994), no. 1, 1-18.
[5] A. Baragar, The exponent for the Markoff-Hurwitz equations. Pacific J. Math. 182 (1998), no. 1, 1-21.
[6] E. Bedford and J. Smillie, Real polynomial diffeomorphisms with maximal entropy: Tangencies. Ann. of Math. (2) 160 (2004), no. 1, 1-26.
[7] F. Beukers, Ternary form equations. J. Number Theory 54 (1995), 113-133.
[8] E. Bombieri, Continued fractions and the Markoff tree. Expo. Math. 25 (2007), no. 3, 187-213.
[9] J. Bourgain, On the Erdös-Volkmann and Katz-Tao ring conjecture, Geom. Funct. Anal. 13 (2003), 334-365.
[10] J. Bourgain, The sum-product theorem in $\mathbb{Z}_{q}$ with $q$ arbitrary. J. Anal. Math. 106 (2008), 1-93.
[11] J. Bourgain, A modular Szemeredi-Trotter theorem for hyperbolas. C. R. Acad. Sci. Paris Sér. I 350 (2012), 793-796.
[12] J. Bourgain and A. Gamburd, New results on expanders. C. R. Math. Acad. Sci. Paris 342 (2006).
[13] J. Bourgain and A. Gamburd, Uniform expansion bounds for Cayley graphs of $\mathrm{SL}_{2}(\mathbb{F}_{p})$. Ann. of Math. 167 (2008), 625-642.
[14] J. Bourgain and A. Gamburd, Random walks and expansion in $\mathrm{SL}_{d}(\mathbb{Z} / p^{n} \mathbb{Z})$. C. R. Math. Acad. Sci. Paris 346 (2008), no. 11-12, 619-623.
[15] J. Bourgain and A. Gamburd, On the spectral gap for finitely-generated subgroups of SU(2). Invent. Math. 171 (2008), no. 1, 83-121.
[16] J. Bourgain and A. Gamburd, Expansion and random walks in $\mathrm{SL}_{d}(\mathbb{Z} / p^{n} \mathbb{Z})$. I. J. Eur. Math. Soc. (JEMS) 10 (2008), no. 4, 987-1011.
[17] J. Bourgain and A. Gamburd, Expansion and random walks in $\mathrm{SL}_{d}(\mathbb{Z} / p^{n} \mathbb{Z})$. II. With an appendix by Bourgain. J. Eur. Math. Soc. (JEMS) 11 (2009), no. 5, 1057-1103.
[18] J. Bourgain and A. Gamburd, Spectral gaps in $\mathrm{SU}(d)$. C. R. Math. Acad. Sci. Paris 348 (2010), no. 11-12, 609-611.
[19] J. Bourgain and A. Gamburd, A spectral gap theorem in $\mathrm{SU}(d)$. J. Eur. Math. Soc. (JEMS) 14 (2012), no. 5, 1455-1511.
[20] J. Bourgain, A. Gamburd, and P. Sarnak, Sieving and expanders. C. R. Acad. Sci. Paris, Sér. I 343 (2006).
[21] J. Bourgain, A. Gamburd, and P. Sarnak, Affine linear sieve, expanders and sum product. Invent. Math. 179 (2010), 559-644.
[22] J. Bourgain, A. Gamburd, and P. Sarnak, Generalization of Selberg's $\frac{3}{16}$ theorem and affine sieve. Acta Math. 207 (2011), no. 2, 255-290.
[23] J. Bourgain, A. Gamburd, and P. Sarnak, Markoff surfaces and strong approximation: 1. 2016, arXiv:1607.01530.
[24] J. Bourgain, A. Gamburd, and P. Sarnak, Markoff triples and strong approximation. C. R. Math. Acad. Sci. Paris 354 (2016), no. 2, 131-135.
[25] J. Bourgain, A. Gamburd, and P. Sarnak, Strong approximation and diophantine properties of Markoff numbers, preprint.
[26] J. Bourgain, A. Gamburd, and P. Sarnak, Strong approximation for varieties of Markoff type, preprint.
[27] J. Bourgain, N. Katz, and T. Tao, A sum-product estimate in finite fields, and applications. Geom. Funct. Anal. 14 (2004), no. 1, 27-57.
[28] J. Bourgain and P. P. Varjú, Expansion in $\mathrm{SL}_{d}(\mathbb{Z} / q \mathbb{Z})$, $q$ arbitrary. Invent. Math. 188 (2012).
[29] D. W. Boyd, The disk-packing constant. Aequationes Math. 7 (1971), 182-193.
[30] D. W. Boyd, Improved bounds for the disk-packing constant. Aequationes Math. 9 (1973), 99-106.
[31] D. W. Boyd, The sequence of radii of the Apollonian packing. Math. Comp. 39 (1982), no. 159, 249-254.
[32] E. Breuillard, A strong Tits alternative. 2008, arXiv:0804.1395.
[33] E. Breuillard, Approximate subgroups and superstrong approximation. In Groups St Andrews 2013, pp. 1-50, London Math. Soc. Lecture Note Ser. 422, Cambridge University Press, Cambridge, 2015.
[34] E. Breuillard and A. Gamburd, Strong uniform expansion in $\mathrm{SL}_{2}(\mathbb{F}_{p})$. Geom. Funct. Anal. 20 (2010), no. 5, 1201-1209.
[35] E. Breuillard, B. Green, and T. Tao, Approximate subgroups of linear groups. Geom. Funct. Anal. 21 (2011).
[36] E. Breuillard and H. Oh, Thin groups and superstrong approximation. Math. Sci. Res. Inst. Publ. 61, Cambridge University Press, Cambridge, 2014.
[37] S. Cantat, Dynamique des automorphismes des surfaces K3. Acta Math. 187 (2001), no. 1, 1-57.
[38] S. Cantat, Bers, Hénon, Painlevé and Schrödinger. Duke Math. J. 149 (2009), no. 3, 411-460.
[39] J. W. S. Cassels, The Markoff chain. Ann. of Math. (2) 50 (1949), 676-685.
[40] J. W. S. Cassels, An Introduction to Diophantine Approximation. Cambridge Tracts in Mathematics and Mathematical Physics 45, Cambridge University Press, New York, 1957, x+166 pp.
[41] J. W. S. Cassels, A note on the Diophantine equation $x^{3}+y^{3}+z^{3}=3$. Math. Comp. 44 (1985), 265-266.
[42] F. Celler, C. R. Leedham-Green, S. Murray, A. Niemeyer, and E. A. O'Brien, Generating random elements of a finite group. Comm. Algebra 23 (1995), 4931-4948.
[43] M.-C. Chang, Elements of large order in prime finite fields. Bull. Aust. Math. Soc. 88 (2013).
[44] M.-C. Chang, B. Kerr, I. Shparlinski, and U. Zannier, Elements of large orders on varieties over prime finite fields. J. Théor. Nombres Bordeaux 26 (2014).
[45] W. Chen, Nonabelian level structures, Nielsen equivalence, and Markoff triples. 2021, arXiv:2011.12940.
[46] H. Cohn, Approach to Markoff's minimal forms through modular functions. Ann. of Math. 61 (1955), 1-12.
[47] H. Cohn, Markoff forms and primitive words. Math. Ann. 196 (1972), 8-22.
[48] J.-L. Colliot-Thélène, D. Wei, and F. Xu, Brauer-Manin obstruction for Markoff surfaces. Annali della Scuola Normale Superiore di Pisa - Classe di Scienze Volume XXI/2021, no. 3 série V (2021).
[49] J. Colliot-Thélène and O. Wittenberg, Groupe de Brauer et points entiers de deux familles de surfaces cubiques affines. Amer. J. Math. 134 (2012), no. 5, 1303-1327.
[50] W. Conn and L. N. Vaserstein, On sums of three integral cubes. pp. 285-294, Contemp. Math. 166, Amer. Math. Soc., 1994.
[51] P. Corvaja and U. Zannier, On integral points on surfaces. Ann. of Math. (2) 160 (2004).
[52] P. Corvaja and U. Zannier, On the greatest prime factor of Markov pairs. Rend. Semin. Mat. Univ. Padova 116 (2006), 253-260.
[53] P. Corvaja and U. Zannier, Greatest common divisors of $u-1, v-1$ in positive characteristic and rational points on curves over finite fields. J. Eur. Math. Soc. (JEMS) 15 (2013).
[54] M. de Courcy-Ireland, Non-planarity of Markoff graphs mod p. 2022, arXiv:2105.12411v3.
[55] M. de Courcy-Ireland and S. Lee, Experiments with the Markoff surface. Exp. Math. (2020). DOI 10.1080/10586458.2019.1702123.
[56] B. Dubrovin and M. Mazzocco, Monodromy of certain Painlevé-VI transcendents and reflection groups. Invent. Math. 141 (2000), 55-147.
[57] W. Duke, Hyperbolic distribution problems and half-integral weight Maass forms. Invent. Math. 92 (1988).
[58] M. Einsiedler, E. Lindenstrauss, P. Michel, and A. Venkatesh, Distribution of periodic torus orbits on homogeneous spaces. Duke Math. J. 148 (2009).
[59] M. Einsiedler, E. Lindenstrauss, P. Michel, and A. Venkatesh, The distribution of closed geodesics on the modular surface, and Duke's theorem. Enseign. Math. (2) 58 (2012).
[60] M. H. El-Huti, Cubic surfaces of Markov type. Math. USSR, Sb. 22 (1974), no. 3, 333-348.
[61] J. Fornaess and N. Sibony, Complex dynamics in higher dimension. II. In: Modern methods in complex analysis (Princeton, NJ, 1992). Ann. of Math. Stud. 137, Princeton University Press, Princeton, NJ, 1995.
[62] G. Frobenius, Über die Markoffschen Zahlen. Akad. Wiss. Berlin (1913), 458-487.
[63] E. Fuchs, Counting problems in Apollonian packings. Bull. Amer. Math. Soc. (N.S.) 50 (2013), no. 2, 229-266.
[64] E. Fuchs, K. Lauter, M. Litman, A. Tran, A Cryptographic Hash Function from Markoff Triples. 2021, arXiv:2107.10906.
[65] E. Fuchs, M. Litman, J. Silverman, A. Tran, Orbits on K3 Surfaces of Markoff Type. 2022, arXiv:2201.12588.
[66] A. Gamburd, Spectral gap for infinite index "congruence" subgroups of $\mathrm{SL}_{2}(\mathbb{Z})$. Israel J. Math. 127 (2002).
[67] A. Gamburd, Singular adventures of Baron Bourgain in the labyrinth of the continuum. Notices Amer. Math. Soc. 67 (2020), no. 11, 1716-1733.
[68] A. Gamburd, M. Magee, and R. Ronan, An asymptotic formula for integer points on Markoff-Hurwitz varieties. Ann. of Math. (2) 190 (2019), no. 3, 751-809.
[69] A. Gamburd and I. Pak, Expansion of product replacement graphs. Combinatorica 26 (2006), no. 4, 411-429.
[70] C. F. Gauss, Brief an Laplace vom 30 Jan. 1812, Werke X1, 1812, pp. 371-374.
[71] A. Ghosh and P. Sarnak, Integral points on Markoff type cubic surfaces. 2022, arXiv:1706.06712v1.
[72] R. Gilman, Finite quotients of the automorphism group of a free group. Canad. J. Math. 29 (1977), 541-551.
[73] W. Goldman, The modular group action on real $\mathrm{SL}(2)$-characters of a one-holed torus. Geom. Topol. 7 (2003), 443-486.
[74] D. S. Gorshkov, Geometry of Lobachevskii in connection with certain questions of arithmetic. Zap. Nauch. Sem. Lenin. Otd. Math. Inst. V.A. Steklova AN SSSR 67 (1977), 39-85. English translation in J. Sov. Math. 16 (1981) 788-820.
[75] D. R. Heath-Brown, The density of zeros of forms for which weak approximation fails. Math. Comp. 59 (1992), no. 200, 613-623.
[76] R. Heath-Brown and S. Konyagin, New bounds for Gauss sums derived from $k$-th powers and for Heilbronn's exponential sum. Quart. J. Math. (2000), 221-235.
[77] H. A. Helfgott, Growth and generation in $\mathrm{SL}_{2}(\mathbb{Z} / p \mathbb{Z})$. Ann. of Math. (2) 167 (2008), no. 2, 601-623.
[78] H. A. Helfgott, Growth in groups: ideas and perspectives. Bull. Amer. Math. Soc. (N.S.) 52 (2015), no. 3, 357-413.
[79] C. Hooley, On Artin's conjecture. J. Reine Angew. Math. 225 (1967).
[80] C. Hooley, Applications of sieve methods to the theory of numbers. Cambridge Tracts in Math. 70, Cambridge, 1976.
[81] S. Hoory, N. Linial, and A. Wigderson, Expander graphs and their applications. Bull. Am. Math. Soc. (2006).
[82] E. Hrushovski, Stable group theory and approximate subgroups. J. Amer. Math. Soc. 25 (2012).
[83] A. Hurwitz, Über eine Aufgabe der unbestimmten Analysis. Arch. Math. Phys. 11 (1907), no. 3, 185-196.
[84] M. Kaluba, D. Kielak, and P. W. Nowak, On property (T) for Aut ( F n ) Aut ⁡ F n Aut(F_(n))\operatorname{Aut}\left(F_{n}\right)Aut⁡(Fn) and S L n ( Z ) S L n ( Z ) SL_(n)(Z)\mathrm{SL}_{n}(\mathbb{Z})SLn(Z). Ann. of Math. (2) 193 (2021), no. 2, 539-562.
[85] A. Kontorovich and J. Lagarias, On the expected number of prime factors in the affine sieve with toral Zariski closure, Exp. Math. 30 (2021), no. 4, 575-586.
[86] A. Kontorovich and H. Oh, Apollonian circle packings and closed horospheres on hyperbolic 3-manifolds. J. Amer. Math. Soc. 24 (2011), no. 3, 603-648. With an appendix by Oh and Nimish Shah.
[87] S. V. Konyagin, S. V. Makarychev, I. E. Shparlinski, and I. V. Vyugin, On the structure of graphs of Markoff triples. Q. J. Math. 71 (2020), no. 2, 637-648.
[88] M. A. Korkine and G. Zolotareff, Sur les formes quadratiques. Math. Ann. 6 (1873), 279-284.
[89] R. O. Kuzmin, On a problem of Gauss. Atti Congr. Int. Mat., Bologna 6 (1932), 83-89.
[90] S. P. Lalley, Renewal theorems in symbolic dynamics, with applications to geodesic flows, non-Euclidean tessellations and their fractal limits. Acta Math. 163 (1989), no. 1-2, 1-55.
[91] M. J. Larsen and R. Pink, Finite subgroups of algebraic groups. J. Amer. Math. Soc. 24 (2011).
[92] M. Laurent, Exponential diophantine equations. C. R. Acad. Sci. 296 (1983), 945-947.
[93] P. D. Lax and R. S. Phillips, The asymptotic distribution of lattice points in Euclidean and non-Euclidean spaces. J. Funct. Anal. 46 (1982), no. 3, 280-350.
[94] D. H. Lehmer, On the Diophantine equation x 3 + y 3 + z 3 = 1 x 3 + y 3 + z 3 = 1 x^(3)+y^(3)+z^(3)=1x^{3}+y^{3}+z^{3}=1x3+y3+z3=1. J. Lond. Math. Soc. s1-31 (1956), no. 3, 275-280.
[95] G. Levitt, La dynamique des pseudogroupes de rotations. Invent. Math. 113 (1993), no. 3, 633-670.
[96] O. Lisovyy and Y. Tykhyy, Algebraic solutions of the sixth Painlevé equation. J. Geom. Phys. 85 (2014), 124-163.
[97] C. Liverani, Decay of correlations. Ann. of Math. (2) 142 (1995), no. 2, 239-301.
[98] D. Loughran and V. Mitankin, Integral Hasse principle and strong approximation for Markoff surfaces. Int. Math. Res. Not. IMRN 18 (2021), 14086-14122.
[99] A. Lubotzky, Cayley graphs: Eigenvalues, expanders and random walks. In Surveys in Combinatorics, edited by P. Rowbinson, London Math. Soc. Lecture Note Ser. 218, Cambridge University Press, 1995.
[100] A. Lubotzky and I. Pak, The product replacement algorithm and Kazhdan's property (T). J. Amer. Math. Soc. 14 (2001).
[101] A. Lubotzky and B. Weiss, Groups and Expanders. In DIMACS Series in Disc. Math. and Theor. Comp. Sci. 10, edited by J. Friedman, pp. 95-109, 1993.
[102] A. Markoff, Sur les formes quadratiques binaires indéfinies. Math. Ann. 15 (1879)
[103] A. Markoff, Sur les formes quadratiques binaires indéfinies. Math. Ann. 17 (1880),
[104] C. R. Matthews, Counting points modulo p p ppp for some finitely generated subgroups of algebraic groups. Bull. Lond. Math. Soc. 14 (1982), 149-154.
[105] C. Matthews, L. Vaserstein, and B. Weisfeiler, Congruence properties of Zariski dense groups. Proc. Lond. Math. Soc. 48 (1984), 514-532.
[106] Mazur, Barry The topology of rational points. Experiment. Math. 1 (1992), no. 1, 35-45.
[107] D. McCullough and M. Wanderley, Nielsen equivalence of generating pairs in SL(2,q). Glasg. Math. J. 55 (2013), 481-509.
[108] C. T. McMullen, Dynamics on K3 surfaces: Salem numbers and Siegel disks J. Reine Angew. Math. 545 (2002), 201-233
[109] C. T. McMullen, Automorphisms of projective K3 surfaces with minimum entropy. Invent. Math. 203 (2016), no. 1, 179-215
[110] C. Meiri and D. Puder, The Markoff group of transformations in prime and composite moduli. Duke Math. J. 167 (2018), no. 14, 2679-2720.
[111] M. Mirzakhani, Counting mapping class group orbits on hyperbolic surfaces. 2016, arXiv:1601.03342v1.
[112] L. J. Mordell, On the integer solutions of the equation x 2 + y 2 + z 2 + 2 x y z = n x 2 + y 2 + z 2 + 2 x y z = n x^(2)+y^(2)+z^(2)+2xyz=nx^{2}+y^{2}+z^{2}+2 x y z=nx2+y2+z2+2xyz=n. J. Lond. Math. Soc. 28 (1953), 500-510.
[113] L. J. Mordell, Diophantine equations. Academic Press, London, New York, 1969.
[114] H. Oh, Dynamics on geometrically finite hyperbolic manifolds with applications to Apollonian circle packings and beyond. In Proceedings of the International Congress of Mathematicians III, pp. 1308-1331, Hindustan Book Agency, New Delhi, 2010.
[115] H. Oh and D. Winter, Uniform exponential mixing and resonance free regions for convex cocompact congruence subgroups of S L 2 ( Z ) S L 2 ( Z ) SL_(2)(Z)\mathrm{SL}_{2}(\mathbb{Z})SL2(Z). J. Amer. Math. Soc., 29 (2016), 1069-1115
[116] N. Ozawa, Aut( F 5 ) F 5 {:F_(5))\left.F_{5}\right)F5) has property (T). J. Inst. Math. Jussieu 15 (2016), no. 1, 85-90.
[117] S. J. Patterson, The limit set of a Fuchsian group. Acta Math. 136 (1976), no. 3-4, 241-273.
[118] V. Platonov and A. Rapinchuk, Algebraic groups and number theory. Pure Appl. Math. 139, Academic Press Inc., Boston, MA, 1994.
[119] M. Pollicott, Zeta functions for Anosov flows. In Proceedings of the International Congress of Mathematicians-Seoul 2014. Vol. III, pp. 661-681, Kyung Moon Sa, Seoul, 2014.
[120] L. Pyber and E. Szabo, Growth in finite simple groups of Lie type. J. Amer. Math. Soc. 29 (2016), no. 1, 95-146.
[121] R. Remak, Über indefinite binare quadratische Minimalformen. Math. Ann. 92 (1924), 155-182.
[122] A. Salehi Golsefidy, Super-approximation, I: p p p\mathfrak{p}p-adic semisimple case. Int. Math. Res. Not. IMRN 23 (2017), 7190-7263.
[123] A. Salehi Golsefidy and P. Sarnak, The affine sieve. J. Acad. Mark. Sci. 4 (2013), 1085 1105 1085 − 1105 1085-11051085-11051085−1105.
[124] A. Salehi Golsefidy and P. P. Varju, Expansion in perfect groups. Geom. Funct. Anal. 22 (2012).
[125] P. Sarnak, What is an expander? Not. Amer. Math. Soc. 51 (2004), 762-763.
[126] P. Sarnak, Affine sieve lecture slides, 2010, http://publications.ias.edu/sarnak/ paper/508.
[127] P. Sarnak, Integral Apollonian packings. Amer. Math. Monthly 118 (2011), no. 4, 291-306.
[128] P. C. Sarnak, Diophantine problems and linear groups. In Proceedings of the International Congress of Mathematicians, Vol. I, II (Kyoto, 1990), pp. 459-471, Math. Soc. Japan, Tokyo, 1991.
[129] P. Sarnak and X. X. Xue, Bounds for multiplicities of automorphic representations. Duke Math. J. 64 (1991), no. 1, 207-227.
[130] H. Schwartz and H. T. Muhly, On a class of cubic Diophantine equations. J. Lond. Math. Soc. 32 (1957), 379-382.
[131] A. Selberg, On the estimation of Fourier coefficients of modular forms. Proc. Sympos. Pure Math. VII (1965), 1-15.
[132] C. Series, The geometry of Markoff numbers. Math. Intelligencer 7 (1985), no. 3, 20-29.
[133] Y. Shalom, The algebraization of Kazhdan's property (T). In International Congress of Mathematicians. Vol. II, pp. 1283-1310, Eur. Math. Soc., Zürich, 2006.
[134] J. H. Silverman, The Markoff equation X 2 + Y 2 + Z 2 = a X Y Z X 2 + Y 2 + Z 2 = a X Y Z X^(2)+Y^(2)+Z^(2)=aXYZX^{2}+Y^{2}+Z^{2}=a X Y ZX2+Y2+Z2=aXYZ over quadratic imaginary fields. J. Number Theory 35 (1990), no. 1, 72-104.
[135] J. H. Silverman, Rational points on K3 surfaces: a new canonical height. Invent. Math. 105 (1991), no. 2, 347-373.
[136] J. H. Silverman, Counting integer and rational points on varieties. In Columbia University Number Theory Seminar, pp. 4, 223-236, Astérisque 228, Math. Soc. France, Paris, 1992.
[137] S. A. Stepanov, The number of points of a hyperelliptic curve over a prime field. Math. USSR, Izv. 3 (1969), no. 5, 1103-1114.
[138] D. Sullivan, The density at infinity of a discrete group of hyperbolic motions. Publ. Math. Inst. Hautes Études Sci. 50 (1979), 171-202.
[139] D. Sullivan, Entropy, Hausdorff measures old and new, and limit sets of geometrically finite Kleinian groups. Acta Math. 153 (1984), no. 3-4, 259-277.
[140] T. Tao, Expansion in finite simple groups of Lie type. Grad. Stud. Math. 164, American Mathematical Society, Providence, RI, 2015.
[141] P. A. Vojta, generalization of theorems of Faltings and Thue-Siegel-Roth-Wirsing. J. Amer. Math. Soc. 25 (1992).
[142] A. Weil, On the Riemann Hypothesis in function fields. Proc. Natl. Acad. Sci. USA 27 (1941), 345-347.
[143] B. Weisfeiler, Strong approximation for Zariski-dense subgroups of semisimple algebraic groups. Ann. of Math. (2) 120 (1984), no. 2, 271-315.
[144] J. P. Whang, Arithmetic of curves on moduli of local systems. Algebra Number Theory 14 (2020), no. 10, 2575-2605.
[145] J. P. Whang, Global geometry on moduli of local systems for surfaces with boundary. Compos. Math. 156 (2020), no. 8, 1517-1559.
[146] J. P. Whang, Nonlinear descent on moduli of local systems. Israel J. Math. 240 (2020), no. 2, 935-1004.
[147] D. Zagier, On the number of Markoff numbers below a given bound. Math. Comp. 39 (1982), no. 160, 709-723.

ALEXANDER GAMBURD

The Graduate Center, CUNY, New York, NY, USA, agamburd@gmail.com

THE NUMBER OF RATIONAL POINTS ON A CURVE OF GENUS AT LEAST TWO

PHILIPP HABEGGER

ABSTRACT

The Mordell Conjecture states that a smooth projective curve of genus at least 2 defined over a number field $F$ admits only finitely many $F$-rational points. It was proved by Faltings in the 1980s and again, using a different strategy, by Vojta. Despite there being two different proofs of the Mordell Conjecture, many important questions regarding the set of $F$-rational points remain open. This survey concerns recent developments towards upper bounds on the number of rational points in connection with a question of Mazur.

MATHEMATICS SUBJECT CLASSIFICATION 2020

Primary 14G05; Secondary 11G30, 11G50, 14G25, 14K15

KEYWORDS

Mordell conjecture, smooth projective curves, rational points, Néron-Tate height, Weil height, family of abelian varieties

1. INTRODUCTION

Mordell's Conjecture asserts the finiteness of the set of rational solutions
$$\left\{(x, y) \in \mathbb{Q}^{2}: P(x, y)=0\right\}$$
for certain bivariate polynomials $P \in \mathbb{Q}[X, Y]$.
To make the statement and results precise, we will adopt the language of projective algebraic curves. Indeed, for the study of the zero set, we may assume that $P$ is irreducible, even as a polynomial in $\mathbb{C}[X, Y]$. Moreover, its homogenization defines a projective curve in the projective plane. The classical procedure of normalization allows us to resolve any singularities. The result is an irreducible smooth projective curve defined over $\mathbb{Q}$. Its complex points define a compact Riemann surface of genus $g \in\{0,1,2, \ldots\}$.
Conversely, let us assume we are presented with a smooth projective curve $C$ defined over $\mathbb{Q}$ that is irreducible as a curve taken over $\mathbb{C}$. The genus $g$ of $C(\mathbb{C})$ taken as a Riemann surface has important consequences for arithmetic questions on $C(\mathbb{Q})$. Indeed, Mordell's Conjecture, proved by Faltings [25], states that $\# C(\mathbb{Q})$ is finite if $g \geq 2$.
We begin by formulating the Mordell Conjecture in slightly higher generality. We then discuss the history of results towards this conjecture. Finally, we give an overview of the proof of a joint work by Ziyang Gao, Vesselin Dimitrov, and the author towards a question of Mazur regarding upper bounds for the cardinality $\# C(\mathbb{Q})$. The upper bound will depend on the genus $g$ and the Mordell-Weil rank of the Jacobian of $C$. For a special case of this result that does not make reference to Jacobians, we refer to Section 6.

1.1. The Mordell Conjecture

We begin by recalling Faltings's Theorem [25], a finiteness statement originally conjectured by Mordell [48]. By a curve we mean a geometrically irreducible projective variety of dimension 1. Throughout, we let $F$ denote a number field and $\bar{F}$ a fixed algebraic closure of $F$.
Theorem 1.1 (Faltings [25]). Let $C$ be a smooth curve of genus at least 2 defined over a number field $F$. Then $C(F)$ is finite.
If the genus of $C$ is small, then one cannot expect finiteness. Indeed, the set $C(F)$ is nonempty after replacing $F$ by a suitable finite extension. If $C$ has genus 0, then $C$ is isomorphic to the projective line and thus $C(F)$ is infinite. If $C$ has genus 1, then $C$ together with a point in $C(F)$ is an elliptic curve. In particular, we obtain an algebraic group. After possibly extending $F$ again, we may assume that $C(F)$ contains a point of infinite order. So $C(F)$ is infinite.
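The genus 0 case can be made concrete. The following is a minimal sketch under assumptions that are not taken from the text: the curve is the conic $x^{2}+y^{2}=1$ over $\mathbb{Q}$, the base point is $(-1,0)$, and the helper name `conic_point` is hypothetical. Projecting from the base point identifies the slopes $t \in \mathbb{Q}$ with the remaining rational points, so distinct slopes give distinct points.

```python
from fractions import Fraction

def conic_point(t):
    """Second intersection of x^2 + y^2 = 1 with the line y = t(x + 1)."""
    t = Fraction(t)
    d = 1 + t * t
    return ((1 - t * t) / d, 2 * t / d)

# Hypothetical sample of slopes a/b; any rational slope works.
slopes = {Fraction(a, b) for b in range(1, 6) for a in range(-5, 6)}
points = {conic_point(t) for t in slopes}

# Every computed point lies on the conic (exact rational arithmetic).
on_curve = all(x * x + y * y == 1 for x, y in points)
```

Since the slope is recovered as $t=y /(x+1)$, the map is injective, which is why the sample of slopes and the resulting set of points have the same size.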
To prove the Mordell Conjecture, Faltings first proved the Shafarevich Conjecture for abelian varieties. At the time, the latter was known to imply the Mordell Conjecture thanks to a construction of Kodaira-Parshin.
Later, Vojta [62] gave a different proof of the Mordell Conjecture that is rooted in diophantine approximation. Bombieri [8] then simplified Vojta's proof. We will recall Vojta's approach for curves in Section 3. The technical heart is the Vojta inequality which we formulate below as Theorem 3.1.
Faltings generalized Vojta's proof of the Mordell Conjecture to cover subvarieties of any dimension of an abelian variety. Indeed, Faltings [26, 27] and Hindry [36] proved the Mordell-Lang Conjecture for subvarieties of abelian varieties. Let $A$ be an abelian variety defined over $F$ and suppose $\Gamma$ is a subgroup of $A(\bar{F})$. The division closure of $\Gamma$ is the subgroup
$$\{P \in A(\bar{F}) : \text{there exists an integer } n \geq 1 \text{ with } n P \in \Gamma\}$$
of $A(\bar{F})$. For example, the division closure of the trivial subgroup $\Gamma=\{0\}$ is the subgroup $A_{\text{tors}}$ of all points of finite order of $A(\bar{F})$. The following theorem holds for all base fields of characteristic 0.
Theorem 1.2 (Mordell-Lang conjecture, Faltings, Hindry). Let $A$ be an abelian variety defined over $F$ and let $\Gamma$ be the division closure of a finitely generated subgroup of $A(\bar{F})$. If $V$ is an irreducible closed subvariety of $A$, then the Zariski closure of $V(\bar{F}) \cap \Gamma$ in $V$ is a finite union of translates of algebraic subgroups of $A$.
The special case when $\Gamma=A_{\text{tors}}$ is called the Manin-Mumford Conjecture and was proved by Raynaud [53].
More recently, Lawrence and Venkatesh [41] gave yet another proof of the Mordell Conjecture. It was inspired by Faltings's original approach and the method of Chabauty-Kim. We refer to the survey [6] on these developments.
In this survey we concentrate mainly on the case of curves and comment on possible extensions to the higher dimensional case.

1.2. Some remarks on effectivity

Despite the variety of approaches to the Mordell Conjecture, no effective proof is known. For example, if the curve $C$ is presented explicitly as the vanishing locus of homogeneous polynomial equations with rational coefficients, say, then in full generality we know no algorithm that produces the finite list of rational points of $C$. The question of effectivity is already open in genus 2, for example, for the family $Y^{2}=X^{5}+t$ parametrized by $t$. Proving an effective version of the Mordell Conjecture is among the most important outstanding problems in diophantine geometry.
Although no general algorithm that determines the set of rational points is currently known, it is sometimes possible to determine the set of rational points. For example, we refer to the Chabauty-Coleman method [13, 15], which provides a clean upper bound for the number of rational points subject to a hypothesis on the Mordell-Weil rank of the Jacobian of $C$. In several applications, this bound equals a lower bound for the number of rational points coming from a list of known rational points. Moreover, aspects of Kim's generalization of the Chabauty method were used by Balakrishnan, Dogra, Müller, Tuitman, and Vonk [5] to compute all rational points of the split Cartan modular curve of level 13, which appears in relation to Serre's uniformity question. A different approach motivated by work of Demjanenko and the theory of unlikely intersections was developed in a program by Checcoli, Veneziano, and Viada [14]. Here too a condition on the rank of the curve's Jacobian is required for the method to apply. A remarkable aspect of this approach is that the authors obtain an explicit upper bound for the height of a rational point.

1.3. The number of rational points: conjectures and results

Given $C$ and $F$ as in Theorem 1.1, which invariants of $C$ need to appear in an upper bound for $\# C(F)$?
Example 1.3. (i) Consider the hyperelliptic curve $C$ presented by
$$y^{2}=(x-1) \cdots(x-2022).$$
Its genus equals $(2022-2) / 2=1010$. Then $C$ contains the rational points $(1,0), \ldots,(2022,0)$. Together with the two points at infinity, we obtain at least 2024 rational points. This example easily generalizes to higher genus. For any $g \geq 2$ and square-free $f \in \mathbb{Q}[X]$ of degree $2 g+2$, the equation $y^{2}=f(x)$ determines a hyperelliptic curve $C$ of genus $g$. If $f$ splits into (pairwise distinct) linear factors over $\mathbb{Q}$, then $\# C(\mathbb{Q}) \geq 2 g+4$. So any upper bound for $\# C(\mathbb{Q})$ must depend on the genus.
This lower bound is far from the truth. Stoll discovered a genus 2 curve defined over $\mathbb{Q}$ with 642 rational points in a family of such curves constructed by Elkies. Mestre showed that for all $g \geq 2$ there is a smooth curve of genus $g$ defined over $\mathbb{Q}$ with at least $8 g+16$ rational points.
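The count $2 g+4$ in part (i) can be checked on a small instance. The sketch below is only an illustration under stated assumptions: it takes the hypothetical genus 2 curve $y^{2}=(x-1) \cdots(x-6)$, searches only integer values of $x$ in an arbitrarily chosen window, and uses helper names (`genus`, `integer_points`) that do not come from the text. Such a search can only produce a lower bound for the number of rational points.

```python
from math import isqrt, prod

def genus(degree):
    """Genus of y^2 = f(x) for square-free f of the given degree (>= 3)."""
    return (degree - 1) // 2

def integer_points(roots, bound):
    """Affine points (x, y) with integer |x| <= bound on y^2 = prod(x - r)."""
    pts = set()
    for x in range(-bound, bound + 1):
        value = prod(x - r for r in roots)
        if value >= 0 and isqrt(value) ** 2 == value:
            y = isqrt(value)
            pts.update({(x, y), (x, -y)})  # the two coincide when y == 0
    return sorted(pts)

roots = range(1, 7)                 # f = (x-1)...(x-6), degree 6 = 2g + 2
g = genus(len(roots))
affine = integer_points(roots, 50)  # search window 50 is an arbitrary choice
# f is monic of even degree, so both points at infinity are rational.
total = len(affine) + 2
```

In this window the only affine points found are the six Weierstrass points $(1,0), \ldots,(6,0)$, which together with the two points at infinity give $2 g+4=8$ points.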
(ii) Let us now fix the curve $C$ and let the number field $F$ vary. We take $C$ as the genus 2 hyperelliptic curve presented by $y^{2}=x^{5}+1$. Consider an integer $n \geq 0$ and the points $\left\{\left(m, \pm\left(m^{5}+1\right)^{1 / 2}\right): m \in\{0, \ldots, n\}\right\}$. So $C(F)$ has at least $2 n+2+1$ elements where $F=\mathbb{Q}\left(\left(m^{5}+1\right)^{1 / 2}\right)_{m \in\{1, \ldots, n\}}$. Any upper bound for $\# C(F)$, even for $C$ fixed, must depend on $F$.
Gabriel Dill pointed out that the number of $F$-points grows at least logarithmically in the degree $[F: \mathbb{Q}]$ in this case. Indeed, $[F: \mathbb{Q}] \leq 2^{n}$, so $\# C(F) \geq 2 n \geq 2(\log [F: \mathbb{Q}]) / \log 2$.
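The degree of the multiquadratic field in this remark can be computed for small $n$. The sketch below relies on a standard fact that is assumed here rather than stated in the text: for positive integers $d_{1}, \ldots, d_{n}$, the degree of $\mathbb{Q}(\sqrt{d_{1}}, \ldots, \sqrt{d_{n}})$ is $2^{\rho}$, where $\rho$ is the $\mathbb{F}_{2}$-rank of the vectors of prime exponents of the $d_{i}$ taken modulo 2. The function names are hypothetical.

```python
def odd_exponent_primes(n):
    """Set of primes dividing n to an odd power (trial division)."""
    odd = set()
    p = 2
    while p * p <= n:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        if e % 2:
            odd.add(p)
        p += 1
    if n > 1:
        odd.add(n)
    return odd

def f2_rank(vectors):
    """Rank over F_2 of vectors given as sets of indices (XOR elimination)."""
    basis = {}  # largest element of a reduced vector -> that vector
    for v in vectors:
        v = set(v)
        while v:
            pivot = max(v)
            if pivot not in basis:
                basis[pivot] = v
                break
            v = v ^ basis[pivot]  # symmetric difference removes the pivot
    return len(basis)

def degree_of_F(n):
    """[F : Q] for F = Q(sqrt(m^5 + 1) : m = 1, ..., n)."""
    return 2 ** f2_rank(odd_exponent_primes(m ** 5 + 1) for m in range(1, n + 1))

def known_points(n):
    """The 2n + 2 affine points with x = 0, ..., n plus the point at infinity."""
    return 2 * n + 3
```

For instance, $m^{5}+1$ equals $2$, $3 \cdot 11$, and $2^{2} \cdot 61$ for $m=1,2,3$, so the three square classes are independent and the degree for $n=3$ is $8$.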
Let us consider the modular curve $X_{0}(37)$ which has genus 2 and is defined over $\mathbb{Q}$. Let $p$ be one of the infinitely many prime numbers for which the Legendre symbol satisfies $(-p / 37)=1$; so 37 splits in the quadratic field $K=\mathbb{Q}(\sqrt{-p})$. Let $F$ denote the Hilbert Class Field of $K$. There is an elliptic curve $E$ defined over $F$ with complex multiplication by the ring of integers of $K$. Moreover, $E$ admits an isogeny of degree 37 to an elliptic curve defined over $F$. Thus $X_{0}(37)$ has an $F$-rational point. The Galois group $\operatorname{Gal}(F / K)$ acts on the $F$-rational points of $X_{0}(37)$. It is also known to act transitively on the moduli of elliptic curves with the same endomorphism ring as $E$. Thus $\# X_{0}(37)(F)$ is no less than $[F: K]$ which equals the class number of $K$ by Class Field Theory. So $\# X_{0}(37)(F) \geq[F: K]=[F: \mathbb{Q}] / 2$. By the Landau-Siegel Theorem, $[F: \mathbb{Q}] \rightarrow \infty$ as $p \rightarrow \infty$. In particular, any upper bound for $\# X_{0}(37)(F)$ must grow at least linearly in $[F: \mathbb{Q}]$.
The Uniformity Conjecture by Caporaso-Harris-Mazur [11] predicts that the genus and base field of a curve are the only invariants required for a general upper bound.
Conjecture 1.4 (Caporaso-Harris-Mazur). Let $g \geq 2$ be an integer and $F$ a number field. There exists $c(g, F) \geq 1$ such that if $C$ is a smooth curve of genus $g$ defined over $F$, then $\# C(F) \leq c(g, F)$.
Caporaso, Harris, and Mazur showed that the Uniformity Conjecture follows from the Weak Lang Conjecture, which is an extension of the Mordell Conjecture to higher dimension. It states that if $V$ is a smooth projective variety defined over $F$ of positive dimension and general type, then $V(F)$ is not Zariski dense in $V$. Pacelli [50] showed that $\# C(F)$ is bounded from above as a function of $g$ and $[F: \mathbb{Q}]$ under the Weak Lang Conjecture, after Abramovich [1] had treated the case of quadratic and cubic extensions. A refined version of the Weak Lang Conjecture implies, again by work of Caporaso-Harris-Mazur, that $\# C(F)$ can be bounded from above as a function of the genus, if we omit finitely many $F$-isomorphism classes of $C$; see also [12] for a correction. Rémond [56] has evidence towards this stronger version of the Uniformity Conjecture. Alpoge [2] showed that, on average, a smooth curve of genus 2 defined over $\mathbb{Q}$ has a uniformly bounded number of rational points.
Mazur [46] posed the following question, which is a weaker version of the Uniformity Conjecture. We let $\operatorname{Jac}(C)$ denote the Jacobian of a smooth curve $C$ defined over a field. Then $\operatorname{Jac}(C)$ is a principally polarized abelian variety whose dimension equals the genus of $C$. If the base field is a number field $F$, then $\operatorname{Jac}(C)(F)$ is a finitely generated abelian group by the Mordell-Weil Theorem.
Question 1.5 (Mazur [46, p. 223]). Let $g \geq 2$ and $r$ be integers and let $F$ be a number field. There exists $c(g, r, F) \geq 1$ such that if $C$ is a smooth curve of genus $g$ defined over $F$ such that the rank of the Mordell-Weil group satisfies $\operatorname{rk} \operatorname{Jac}(C)(F) \leq r$, then $\# C(F) \leq c(g, r, F)$.
Let us review some work on upper bounds for $\# C(F)$. Parshin [59] showed how to extract an upper bound for the number of rational points from Faltings's theorem. In his original paper, Vojta [62] gave a blueprint on how to bound from above the number of rational points for a general $C$. This bound was refined by Bombieri [8] and de Diego [19]. However, these works did not provide an answer to Mazur's question.
To formulate de Diego's results we need some additional notation. We also require the Weil height on projective space and the Néron-Tate (or canonical) height; both are defined in Section 2. Let $S$ be an irreducible, smooth, quasiprojective variety defined over a number field $F$ and presented with an immersion $S \subseteq \mathbb{P}^{n}$ defined over $F$. De Diego's Theorem holds for a smooth family of curves parametrized by the base $S$. Indeed, let $\mathscr{C} \rightarrow S$ be a smooth morphism such that each fiber is a smooth curve of genus $g \geq 2$. We write $\mathscr{C}_{s}$ for the fiber of $\mathscr{C} \rightarrow S$ above $s \in S(\bar{F})$. This is a smooth curve defined over $\bar{F}$. Let $\mathcal{K}_{s}$ denote the canonical class on $\mathscr{C}_{s}$; we identify it with a divisor class modulo linear equivalence of degree $2 g-2$. If $P \in \mathscr{C}_{s}(\bar{F})$, then $(2 g-2)[P]-\mathcal{K}_{s}$ is well defined as a divisor class of degree 0. So it represents a point in $\operatorname{Jac}(\mathscr{C}_{s})$. In this way we obtain a morphism
$$j_{s}: \mathscr{C}_{s} \rightarrow \operatorname{Jac}(\mathscr{C}_{s}) \quad \text{given by } P \mapsto\left((2 g-2)[P]-\mathcal{K}_{s}\right) / \sim.$$
Let $\theta_{s}$ denote the theta divisor on $\operatorname{Jac}(\mathscr{C}_{s})$ and $\hat{h}_{s}=\hat{h}_{\mathscr{C}_{s}, \theta_{s}}$ the canonical height on $\operatorname{Jac}(\mathscr{C}_{s})$ attached to this divisor.
Theorem 1.6 (de Diego [19]). There exists $c(\mathscr{C})>1$ such that if $F^{\prime} / F$ is a finite extension and $s \in S(F^{\prime})$, then
$$\#\left\{P \in \mathscr{C}_{s}(F^{\prime}): \hat{h}_{s}(j_{s}(P)) \geq c(\mathscr{C})(1+h(s))\right\} \leq \frac{55}{2} \cdot 7^{\operatorname{rk} \operatorname{Jac}(\mathscr{C}_{s})(F^{\prime})}.$$
Roughly speaking, this theorem tells us that the number of points of $\mathscr{C}_{s}$ of sufficiently large canonical height is bounded as in Mazur's question. We will often call these points large points. It is striking that the constant 7 is admissible for all genera; a fact that already appeared in Bombieri's work [8]. For smooth curves of genus 2 defined over $\mathbb{Q}$ with a marked Weierstrass point, Alpoge [2] improved 7 to 1.872.
Observe that
$$\left\{P \in \mathscr{C}_{s}(F^{\prime}): \hat{h}_{s}(j_{s}(P))<c(\mathscr{C})(1+h(s))\right\} \tag{1.1}$$
is finite by the Northcott property for height functions which we will review in Section 2. To obtain a positive answer to Mazur's question we need, roughly speaking, to get a similar bound as in Theorem 1.6 for the cardinality of (1.1). There are quantitative versions of Northcott's Theorem. Estimating the cardinality of (1.1) with these does, however, introduce a dependence on $h(s)$.
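The dependence on the height bound can be seen in the simplest instance of the Northcott property. The sketch below is not the canonical height on a Jacobian; it assumes the multiplicative height $H(a / b)=\max (|a|,|b|)$ for a fraction $a / b \in \mathbb{Q}$ in lowest terms, a standard normalization chosen here for illustration, and the function name is hypothetical. The count is finite for each bound $B$ but grows with $B$.

```python
from math import gcd

def count_rationals_of_height_at_most(B):
    """Number of fractions a/b in lowest terms (b >= 1) with max(|a|, b) <= B."""
    return sum(
        1
        for b in range(1, B + 1)
        for a in range(-B, B + 1)
        if gcd(a, b) == 1
    )

# Finite for every bound, but increasing in the bound.
counts = [count_rationals_of_height_at_most(B) for B in (1, 2, 3)]
```

A bound for (1.1) obtained by counting points of bounded height in this way inherits such growth in the height threshold, which is the source of the dependence on $h(s)$ mentioned above.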
Work of David-Philippon [17] and Rémond [54] further clarified the other value $c(\mathscr{C})$. Indeed, David and Philippon proved a lower bound for the canonical height that, when combined with Rémond's explicit version of the Vojta inequality, yields the next theorem. To state their result, we momentarily shift our focus from families of smooth curves and their Jacobians to a curve immersed in an abelian variety.
Theorem 1.7 (Rémond [17, p. 643]). Let $A$ be a $g$-dimensional principally polarized abelian variety defined over $F$. Let $\Gamma$ be the division closure of a finitely generated subgroup of $A(\bar{F})$ of rank $r$ and let $C \subseteq A$ be a curve that is not smooth of genus 1. Then $C(\bar{F}) \cap \Gamma$ is finite of cardinality at most
$$\left(2^{34} h_{0}(A) \operatorname{deg} C\right)^{g^{20}(r+1)}.$$
Here $\operatorname{deg} C$ is the degree of $C$ with respect to the principal polarization. Moreover, $h_{0}(A)$ is a height of the abelian variety $A$ whose definition involves classical theta functions and the degree $[F: \mathbb{Q}]$. Roughly speaking, $h_{0}(A)$ encodes a bound for the coefficients needed to reconstruct the abelian variety $A$. Mazur's question does not allow for a dependence on $h_{0}(A)$. The hypothesis that $C$ is not smooth of genus 1 is natural and cannot be dropped in general. It is equivalent to stating that $C$ is not a translate of an algebraic subgroup of $A$.
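To get a feeling for the size of the bound in Theorem 1.7, one can evaluate its logarithm. The sketch below is purely arithmetic; the function name and the sample inputs ($h_{0}(A)=1$ and $\operatorname{deg} C=1$) are hypothetical placeholders chosen to isolate the contribution of the factor $2^{34}$ and the exponent $g^{20}(r+1)$, and need not correspond to an actual abelian variety.

```python
from math import log10

def remond_bound_log10(g, r, h0, deg_c):
    """log10 of (2^34 * h0 * deg_c) ** (g^20 * (r + 1))."""
    exponent = g ** 20 * (r + 1)
    return exponent * (34 * log10(2) + log10(h0) + log10(deg_c))

# Hypothetical inputs: genus 2, rank 0, h0 = 1, deg C = 1.
digits_g2_r0 = remond_bound_log10(2, 0, 1, 1)
```

Already for $g=2$ and $r=0$ the exponent is $2^{20}$, so the bound has more than ten million decimal digits even with these minimal placeholder inputs; the logarithm of the bound grows linearly in $r+1$.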
David and Philippon's contribution to Theorem 1.7 was their lower bound for the canonical height, see [17, THÉORÈME 1.4]. Rémond [54] made Vojta's approach (and an inequality of Mumford) completely explicit. David-Philippon and Rémond have a result for subvarieties of $A$ of any dimension. In other words, they provide an explicit version of the Mordell-Lang Conjecture.
David and Philippon's approach to Mazur's question and its higher-dimensional counterparts is via a strong quantitative version of the Bogomolov Conjecture on points of small height. A suitable version is Conjecture 1.5 [18] where the lower bound for the canonical height grows linearly in the Faltings height. We refer to [18, THÉORÈME 1.11] regarding the connection to rational points and more generally the Mordell-Lang Conjecture.
David and Philippon were able to strengthen their height lower bound when $A$ is a power of an elliptic curve. This provided more evidence towards a positive answer for Mazur's question. Here is a version of their result for curves; their general result holds for subvarieties of a power of an elliptic curve.
Theorem 1.8 (David and Philippon [18, THÉORÈME 1.13]). Let $E$ be an elliptic curve defined over $F$ and let $g \geq 2$ be an integer. Suppose $\Gamma$ is the division closure of a finitely generated subgroup of $E^{g}(\bar{F})$ of rank $r$. If $C \subseteq E^{g}$ is a curve that is not smooth of genus 1, then $\# C(\bar{F}) \cap \Gamma \leq \operatorname{deg}(C)^{7 g^{18}(1+r)}$.
Thanks to a specialization argument, David and Philippon extended the above result to include the case where $F$ is an arbitrary field of characteristic 0. David, Nakamaye, and Philippon [16] then proved the existence of a $(g-2)$-dimensional family of curves of genus $g$ for which Mazur's question has a positive answer.
We now very briefly turn to some cardinality estimates using the Chabauty-Coleman method, which is based on $p$-adic analysis. It can produce finiteness of $C(F)$ with a clean cardinality estimate subject to a restriction on the rank of the Mordell-Weil group.
Theorem 1.9 (Coleman [15]). Suppose $C$ is a smooth curve of genus $g \geq 2$ defined over $\mathbb{Q}$ with $\operatorname{rk} \operatorname{Jac}(C)(\mathbb{Q}) \leq g-1$. If $p>2g$ is a prime number where $C$ has good reduction $\tilde{C}$, then $\# C(\mathbb{Q}) \leq 2g-2+\# \tilde{C}(\mathbb{F}_{p})$.
In combination with the Hasse-Weil bound $\# \tilde{C}(\mathbb{F}_{p}) \leq p+1+2g\sqrt{p}$, the estimate above yields a bound for $\# C(\mathbb{Q})$ in terms of $g$ and $p$ alone. Observe that a dependence on the arithmetic of $C$ appears through the prime $p$. Stoll was able to remove this dependence for hyperelliptic curves at the cost of a stronger restriction on the rank of the Mordell-Weil group.
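The combined estimate is easy to evaluate numerically. The following is a minimal sketch, not part of the original text: the function name `coleman_hasse_weil_bound` is hypothetical, and the code only checks the inequality $p > 2g$. It does not verify that $p$ is prime, that $C$ has good reduction at $p$, or the rank hypothesis of Theorem 1.9.

```python
from math import isqrt


def coleman_hasse_weil_bound(g: int, p: int) -> int:
    """Largest integer not exceeding 2g - 2 + p + 1 + 2g*sqrt(p).

    This is the bound for #C(Q) obtained by inserting the Hasse-Weil
    estimate into Theorem 1.9. Primality of p, good reduction at p and
    the rank condition are assumed, not checked.
    """
    if g < 2:
        raise ValueError("the genus must be at least 2")
    if p <= 2 * g:
        raise ValueError("Theorem 1.9 requires p > 2g")
    # floor(2g*sqrt(p)) equals isqrt(4*g^2*p); this avoids floating point.
    return (2 * g - 2) + (p + 1) + isqrt(4 * g * g * p)


if __name__ == "__main__":
    # Genus 2 with good reduction at p = 5 (assumed): 2 + 6 + floor(4*sqrt(5)).
    print(coleman_hasse_weil_bound(2, 5))
```

The output depends on $p$, which is exactly the dependence on the arithmetic of $C$ mentioned above.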
Theorem 1.10 (Stoll [58]). Let $g \geq 2$ and $d \geq 1$ be integers. There exists $c(g, d)>0$ with the following property. Suppose $C$ is a smooth hyperelliptic curve of genus $g$ defined over $F$ with $[F: \mathbb{Q}] \leq d$. If $\operatorname{rk} \operatorname{Jac}(C)(F) \leq g-3$, then $\# C(F) \leq c(g, d)$.
Later, Katz, Rabinoff, and Zureick-Brown dropped the hyperellipticity condition.
Theorem 1.11 (Katz, Rabinoff, and Zureick-Brown [38]). Let $g \geq 2$ and $d \geq 1$ be integers. There exists $c(g, d)>0$ with the following property. Suppose $C$ is a smooth curve of genus $g$ defined over $F$ with $[F: \mathbb{Q}] \leq d$. If $\operatorname{rk} \operatorname{Jac}(C)(F) \leq g-3$, then $\# C(F) \leq c(g, d)$.
After this detour to the Chabauty-Coleman method, we return to Vojta's method. Vesselin Dimitrov, Ziyang Gao, and the author have recently proved a lower bound for the canonical height that can be used as a replacement for the lower bounds by David and Philippon [17,18] in the context of Mordell's Conjecture. We recall this height inequality in Section 4.2 below. Indeed, it led to a positive answer to a strengthening of Mazur's question. The following result is new already in genus 2.
Theorem 1.12 (Dimitrov, Gao, and Habegger [24, THEOREM 1.1]). Let $g \geq 2$ and $d \geq 1$ be integers. There exist $c^{\prime}(g, d)>1$ and $c(g, d)>1$ with the following property. Suppose $C$ is a smooth curve of genus $g$ defined over a number field $F$ such that $[F: \mathbb{Q}] \leq d$. Then
$$\# C(F) \leq c^{\prime}(g, d) \cdot c(g, d)^{\operatorname{rk} \operatorname{Jac}(C)(F)}.$$
Regarding the Caporaso-Harris-Mazur Uniformity Conjecture, we ask
Question 1.13. Can the cardinality $\# C(F)$ be bounded from above by a function that is polynomial in $[F: \mathbb{Q}]$ and $g$?
No one currently knows an algorithm that computes the rank of the Mordell-Weil group $\operatorname{Jac}(C)(F)$. However, upper bounds for this rank follow, for example, from the Ooe-Top Theorem [49]. We discuss this in more depth in Section 6.
Our results also cover points on $C$ that lie in the division closure of a finitely generated subgroup. Let $\overline{\mathbb{Q}}$ denote the algebraic closure of $\mathbb{Q}$ in $\mathbb{C}$. The Jacobian $\operatorname{Jac}(C)$ of a smooth curve $C$ of genus $g$ defined over $\overline{\mathbb{Q}}$ corresponds to a $\overline{\mathbb{Q}}$-point of the coarse moduli space $\mathbb{A}_{g}$ of $g$-dimensional principally polarized abelian varieties. We let $[\operatorname{Jac}(C)]$ denote the point of $\mathbb{A}_{g}(\overline{\mathbb{Q}})$ corresponding to $\operatorname{Jac}(C)$ with its canonical principal polarization.
For example, if $g=1$ then $\mathbb{A}_{g}=\mathbb{A}^{1}$ is the affine line. If $E$ is an elliptic curve defined over $\overline{\mathbb{Q}}$, then $[E]$ is the $j$-invariant of $E$.
In general, $\mathbb{A}_{g}$ is a quasiprojective variety of dimension $g(g+1)/2$ defined over $\mathbb{Q}$. We may fix an immersion $\iota: \mathbb{A}_{g} \hookrightarrow \mathbb{P}^{n}$ into projective space. Then the absolute logarithmic Weil height $h$, see Section 2 for a definition, pulls back to a function $h \circ \iota: \mathbb{A}_{g}(\overline{\mathbb{Q}}) \rightarrow[0, \infty)$.
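In the case $g=1$ the function $h \circ \iota$ can be computed by hand. The sketch below is an illustration under stated assumptions and is not taken from the original text: the elliptic curve is given over $\mathbb{Q}$ in short Weierstrass form $y^2 = x^3 + ax + b$, the immersion is taken to be $j \mapsto [j:1]$ from $\mathbb{A}^1$ into $\mathbb{P}^1$, and the height of a rational point is the one from Definition 2.1. The function names are hypothetical.

```python
from fractions import Fraction
from math import log


def j_invariant(a, b) -> Fraction:
    """j-invariant of y^2 = x^3 + a*x + b with rational a, b."""
    a, b = Fraction(a), Fraction(b)
    disc = 4 * a**3 + 27 * b**2
    if disc == 0:
        raise ValueError("the cubic is singular, so this is not an elliptic curve")
    return 1728 * 4 * a**3 / disc


def height_of_j(j: Fraction) -> float:
    """Weil height of the point [j : 1] of P^1(Q).

    A Fraction is stored in lowest terms, so [numerator : denominator]
    are coprime integer coordinates as required by Definition 2.1.
    """
    return log(max(abs(j.numerator), j.denominator))


if __name__ == "__main__":
    for a, b in [(-1, 0), (0, 1), (1, 1)]:
        j = j_invariant(a, b)
        print(f"y^2 = x^3 + ({a})x + ({b}): j = {j}, height = {height_of_j(j):.4f}")
```

A different choice of immersion $\iota$ would change these values, which is why the constants in the results below are allowed to depend on $\iota$.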
If $C$ is a smooth curve of genus $g \geq 1$ defined over $\overline{\mathbb{Q}}$ and if $P_{0} \in C(\overline{\mathbb{Q}})$, then a point $P \in C(\overline{\mathbb{Q}})$ defines a divisor $[P]-[P_{0}]$ of degree 0. One obtains a closed immersion
$$C \hookrightarrow \operatorname{Jac}(C) \quad \text{from} \quad P \mapsto\left([P]-[P_{0}]\right) / \sim,$$
where $\sim$ again denotes linear equivalence. We will write $C-P_{0}$ for the image of $C$ in $\operatorname{Jac}(C)$.
Theorem 1.14 (Dimitrov, Gao, and Habegger [24, THEOREM 1.2]). Let $g \geq 2$ be an integer. There exist $c(g, \iota)>1$, $c^{\prime}(g, \iota)>0$, and $c^{\prime \prime}(g, \iota)>0$ that depend on $g$ and the immersion $\iota$ with the following property. Suppose $C$ is a smooth curve of genus $g$ defined over $\overline{\mathbb{Q}}$ and let $P_{0} \in C(\overline{\mathbb{Q}})$. Let $\Gamma$ be the division closure of a finitely generated subgroup of $\operatorname{Jac}(C)(\overline{\mathbb{Q}})$ of rank $r$. If
$$h(\iota([\operatorname{Jac}(C)])) \geq c^{\prime \prime}(g, \iota) \quad \text{then} \quad \#\left(C-P_{0}\right)(\overline{\mathbb{Q}}) \cap \Gamma \leq c^{\prime}(g, \iota)\, c(g, \iota)^{r}.$$
In particular, we may take $\Gamma=\operatorname{Jac}(C)_{\text{tors}}$ and $r=0$. Thus the theorem yields a uniform bound for the number of torsion points that lie on $C-P_{0}$ if the height of $\iota([\operatorname{Jac}(C)])$ is sufficiently large.
Suppose that $C$ is defined over a number field $F$. Then $[\operatorname{Jac}(C)]$ is an $F$-rational point of the moduli space $\mathbb{A}_{g}$. If we impose also $h(\iota([\operatorname{Jac}(C)]))<c^{\prime \prime}(g, \iota)$, then $[\operatorname{Jac}(C)]$ lies in a finite set by the Northcott property. Thus $\operatorname{Jac}(C)$ is in one of at most finitely many $\overline{\mathbb{Q}}$-isomorphism classes and so is $C$ by the Torelli Theorem.
Raynaud proved the following result which is the Manin-Mumford Conjecture for curves.
Theorem 1.15 (Raynaud [52]). Let $C$ be a smooth curve defined over $\mathbb{C}$ of genus at least 2. Then $\left(C-P_{0}\right) \cap \operatorname{Jac}(C)_{\text{tors}}$ is finite.
Theorem 1.14 gives evidence towards the Uniform Manin-Mumford Conjecture which states that $\#\left(C-P_{0}\right) \cap \operatorname{Jac}(C)_{\text{tors}}$ is bounded from above in terms of the genus $g$ only for any smooth curve $C$ of genus $g \geq 2$ defined over any field in characteristic 0.
Using a different approach involving equidistribution and motivated by dynamical systems, DeMarco, Krieger, and Ye [20] had made substantial progress towards the Uniform Manin-Mumford Conjecture. They proved it for smooth curves of genus 2 defined over $\mathbb{C}$ that are double covers of an elliptic curve when the base point $P_{0}$ is a Weierstrass point.
In a preprint, Kühne [39] complemented the method in [24] using ideas from equidistribution to prove the Uniform Manin-Mumford Conjecture.
Theorem 1.16 (Kühne [39]). Let $g \geq 2$ be an integer. There exist $c(g)>1$ and $c^{\prime}(g)>1$ that depend on $g$ with the following property. Suppose $C$ is a smooth curve of genus $g$ defined over $\mathbb{C}$ and let $P_{0} \in C(\mathbb{C})$. Let $\Gamma$ be the division closure of a finitely generated subgroup of $\operatorname{Jac}(C)(\mathbb{C})$ of rank $r$. Then $\#\left(C-P_{0}\right)(\mathbb{C}) \cap \Gamma \leq c^{\prime}(g)\, c(g)^{r}$.
In contrast to Theorem 1.14, Kühne is able to handle curves $C$ defined over $\overline{\mathbb{Q}}$ for which $[\operatorname{Jac}(C)]$ has height less than $c^{\prime \prime}(g, \iota)$. Once uniformity is established for all curves over $\overline{\mathbb{Q}}$, Kühne is able to pass to the base field $\mathbb{C}$ using a specialization argument laid out by Dimitrov, Gao, and the author [22] which relies on a result of Masser [43]. Kühne thus answers an older question of Mazur, see the top of page 234 [45], and obtains the full Mordell-Lang variant for curves.
DeMarco and Mavraki's [21] work on a relative version of the Bogomolov conjecture, see [72] and [22], motivated Kühne [39,40] to extend the reach of Arakelovian equidistribution methods of Szpiro-Ullmo-Zhang [60] and Yuan [65] to families of abelian varieties over a quasiprojective base. For algebraic curves, this settles the uniform Bogomolov and the uniform Manin-Mumford conjectures.
Yuan [66] recently gave another proof of Theorem 1.16. His method also runs via a uniform Bogomolov theorem and thus contains aspects related to height lower bounds. However, Yuan's approach relies on arithmetic bigness, rather than on equidistribution. It is independent of the approaches mentioned above and uses a new theory of adelic line bundles over quasiprojective varieties developed by Yuan and Zhang [67] which generalizes Zhang's theory [70] in the projective case. They derive a height inequality for a polarized dynamical system, see Theorem 1.3.2 and Section 6 [67], that extends our own bound. One aspect of Yuan's method is that it works for global fields in any characteristic.
We come to some questions regarding the base constant $c(g)$ in the estimates above. In the context of Mordell's Conjecture, Bombieri observed that the number of large points is bounded by a multiple of $7^{\operatorname{rk} \operatorname{Jac}(C)(F)}$.
Question 1.17. Can the base 7 in the estimate for the number of large points as in Theorem 1.6 be replaced by a function in $g$ that tends to 1 for $g \rightarrow \infty$?
Alpoge [2] used the Kabatiansky-Levenshtein estimates on spherical codes to improve on the constant 7 in genus 2. It is quite possible that Alpoge's approach will shed light on this last question.
Concerning the constant $c(g)$ in Theorem 1.16, we pose the following two questions which also cover the moderate, i.e., nonlarge, points. They were inspired by questions of Helfgott.
Question 1.18. Can we choose the $c(g)$ in Theorem 1.16 such that there exists $B \geq 1$ with $c(g) \leq B$ for all integers $g \geq 2$?
Question 1.19. Can we choose the $c(g)$ in Theorem 1.16 with $\lim_{g \rightarrow \infty} c(g)=1$?
Recently, Gao, Ge, and Kühne [32] completed the proof of the Uniform Mordell-Lang Conjecture for a subvariety $V$ of a polarized abelian variety $A$ of any dimension. Uniformity here amounts to bounding the number of irreducible components of the Zariski closure in Theorem 1.2 from above by $c^{\prime}(\operatorname{dim} A, \operatorname{deg} V)\, c(\operatorname{dim} A, \operatorname{deg} V)^{r}$. Their result holds over all base fields in characteristic 0.
We refer to the comprehensive survey by Gao [31] that gives an overview of these recent developments and how they are interlinked.
Here is a brief overview of this survey. In Section 2 we recall some fundamental properties of two height functions: the Weil and Néron-Tate heights. They play a central role in the proof of Theorem 1.12. Then in Section 3 we describe Vojta's approach to the Mordell Conjecture. Later we return to the two height functions and describe their interactions on a family of abelian varieties. This is done in Section 4. Here we also describe the Betti map, an important analytic tool. In Section 5 we sketch how all this fits together in the proof of Theorem 1.12. In the final section we discuss an estimate for the number of rational points on a hyperelliptic curve that does not make reference to Jacobians.

2. HEIGHTS

Height functions are at the heart of Vojta's proof of the Mordell Conjecture and subsequent results such as Theorem 1.12. We will review two flavors of heights. The first one is the absolute logarithmic Weil height which is defined on algebraic points of the projective space. One can also use it to define a class of height functions on a projective variety equipped with an invertible sheaf. The second height function is the canonical or Néron-Tate height on an abelian variety, also equipped with an invertible sheaf. The latter is compatible with the group structure on the abelian variety.

2.1. The absolute logarithmic Weil height

We review here briefly the main properties of the Weil height. For a thorough treatment, we refer to [9, CHAPTERS 1 AND 2] or [37, PART B].
We begin by defining the height of a rational point on projective space $\mathbb{P}^{n}$.
Definition 2.1. Let $P \in \mathbb{P}^{n}(\mathbb{Q})$. There exist projective coordinates $\left(x_{0}, \ldots, x_{n}\right) \in \mathbb{Z}^{n+1} \backslash\{0\}$ of $P=\left[x_{0}: \cdots: x_{n}\right]$ with $\operatorname{gcd}\left(x_{0}, \ldots, x_{n}\right)=1$. Then we set
$$h(P)=\log \max \left\{\left|x_{0}\right|, \ldots,\left|x_{n}\right|\right\}.$$
The vector $\left(x_{0}, \ldots, x_{n}\right)$ is uniquely determined up to a sign, and so $h(P)$ is well defined. For example, $h([2: 4: 6])=h([1: 2: 3])=h([1/3: 2/3: 1])=\log 3$.
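Definition 2.1 translates directly into a short computation. The following is a minimal sketch and not part of the original text. The helper names `primitive_coordinates` and `height` are hypothetical, coordinates may be given as integers or `Fraction` objects, and `math.lcm` requires Python 3.9 or later.

```python
from fractions import Fraction
from functools import reduce
from math import gcd, lcm, log


def primitive_coordinates(coords):
    """Rescale rational projective coordinates to coprime integers."""
    fracs = [Fraction(c) for c in coords]
    if all(f == 0 for f in fracs):
        raise ValueError("the zero vector does not define a point of P^n")
    # Clear denominators, then divide by the gcd of the resulting integers.
    common_den = reduce(lcm, (f.denominator for f in fracs), 1)
    ints = [int(f * common_den) for f in fracs]
    g = reduce(gcd, ints)
    return [x // g for x in ints]


def height(coords) -> float:
    """Height of a point of P^n(Q) as in Definition 2.1."""
    return log(max(abs(x) for x in primitive_coordinates(coords)))


if __name__ == "__main__":
    for point in ([2, 4, 6], [1, 2, 3], [Fraction(1, 3), Fraction(2, 3), 1]):
        print(primitive_coordinates(point), height(point))
```

All three points from the example above reduce to the same primitive vector, so the computed heights agree.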
The following theorem is a straightforward consequence of the definition of the Weil height.
Theorem 2.2 (Northcott property). The set $\left\{P \in \mathbb{P}^{n}(\mathbb{Q}): h(P) \leq B\right\}$ is finite for all $B$.
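The finiteness is visible from the definition: a point of height at most $\log H$ has coprime integer coordinates in the box $[-H, H]^{n+1}$, and there are only finitely many such vectors. The sketch below, which is an illustration and not part of the original text, lists these points by brute force. The function name `points_of_bounded_height` is hypothetical, and the sign ambiguity of the primitive vector is removed by requiring the first nonzero coordinate to be positive.

```python
from functools import reduce
from itertools import product
from math import gcd


def points_of_bounded_height(n: int, H: int):
    """Points of P^n(Q) whose primitive coordinates satisfy max|x_i| <= H."""
    points = []
    for v in product(range(-H, H + 1), repeat=n + 1):
        if not any(v):
            continue  # the zero vector is not a point
        if reduce(gcd, v) != 1:
            continue  # keep only primitive integer vectors
        first_nonzero = next(x for x in v if x != 0)
        if first_nonzero < 0:
            continue  # v and -v define the same point; keep one of them
        points.append(v)
    return points


if __name__ == "__main__":
    for H in (1, 2, 3):
        print(H, len(points_of_bounded_height(1, H)))
```

The enumeration takes time proportional to $(2H+1)^{n+1}$, so it is only meant for small $n$ and $H$.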
Defining the height of an algebraic point in $\mathbb{P}^{n}(\overline{\mathbb{Q}})$ requires some basic algebraic number theory. Indeed, let $K$ be a number field. We let $M_{K}$ denote the set of absolute values $|\cdot|: K \rightarrow[0, \infty)$ that extend either the standard absolute value on $\mathbb{Q}$ or a $p$-adic absolute value for some prime $p$. Then $M_{K}$ is called the set of places of $K$. For each $v \in M_{K}$, one sets $d_{v}=\left[K_{v}: \mathbb{Q}_{w}\right]$ where $K_{v}$ is a completion of $K$ with respect to $v$ and $\mathbb{Q}_{w}$ is the completion of $\mathbb{Q}$ in $K_{v}$ with respect to $w=\left.v\right|_{\mathbb{Q}}$.
Definition 2.3. Let $P \in \mathbb{P}^{n}(\overline{\mathbb{Q}})$ and let $K$ be a number field such that $P=\left[x_{0}: \cdots: x_{n}\right]$ where $\left(x_{0}, \ldots, x_{n}\right) \in K^{n+1} \backslash\{0\}$. The absolute logarithmic Weil height, or just Weil height, is
$$h(P)=\frac{1}{[K: \mathbb{Q}]} \sum_{v \in M_{K}} d_{v} \log \max \left\{\left|x_{0}\right|_{v}, \ldots,\left|x_{n}\right|_{v}\right\}. \tag{2.1}$$
The normalization constants d v d v d_(v)d_{v}dv are chosen such that the product formula
$$\prod_{v \in M_{K}}|x|_{v}^{d_{v}}=1$$
holds for all $x \in K \backslash\{0\}$. This guarantees that the right-hand side of (2.1) is independent of the choice of projective coordinates of $P$. In particular, we may assume that some projective coordinate of $P$ equals 1. Thus $h(P) \geq 0$ for all $P \in \mathbb{P}^{n}(\overline{\mathbb{Q}})$. Moreover, $h(P)$ is independent of the field $K$ containing the projective coordinates.
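For $K=\mathbb{Q}$ both the product formula and the independence of the projective coordinates can be checked with exact rational arithmetic. The sketch below is an illustration and not part of the original text. It assumes $K=\mathbb{Q}$, so the places are the usual absolute value and the $p$-adic absolute values normalized by $|p|_p = 1/p$, and all $d_v$ equal 1. The function names are hypothetical, and `exp_height` returns $\exp h(P)$, that is, the multiplicative form of (2.1).

```python
from fractions import Fraction


def primes_dividing(m: int) -> set:
    """Set of primes dividing the integer m (trial division)."""
    m, primes, p = abs(m), set(), 2
    while p * p <= m:
        while m % p == 0:
            primes.add(p)
            m //= p
        p += 1
    if m > 1:
        primes.add(m)
    return primes


def abs_p(x: Fraction, p: int) -> Fraction:
    """p-adic absolute value of a nonzero rational x, with |p|_p = 1/p."""
    def val(m):
        k = 0
        while m % p == 0:
            m //= p
            k += 1
        return k
    return Fraction(p) ** (val(x.denominator) - val(x.numerator))


def product_formula(x) -> Fraction:
    """Product of |x|_v over all places v of Q, for nonzero rational x."""
    x = Fraction(x)
    result = abs(x)  # archimedean place
    for p in primes_dividing(x.numerator) | primes_dividing(x.denominator):
        result *= abs_p(x, p)  # all other primes contribute the factor 1
    return result


def exp_height(coords) -> Fraction:
    """exp(h(P)) computed from (2.1) with K = Q, for any rational coordinates."""
    nonzero = [Fraction(c) for c in coords if Fraction(c) != 0]
    if not nonzero:
        raise ValueError("the zero vector does not define a point of P^n")
    total = max(abs(x) for x in nonzero)  # archimedean local factor
    primes = set()
    for x in nonzero:
        primes |= primes_dividing(x.numerator) | primes_dividing(x.denominator)
    for p in primes:
        total *= max(abs_p(x, p) for x in nonzero)  # local factor at p
    return total


if __name__ == "__main__":
    print(product_formula(Fraction(12, 35)))
    print(exp_height([2, 4, 6]), exp_height([Fraction(1, 3), Fraction(2, 3), 1]))
```

No normalization of the coordinates is performed in `exp_height`: rescaling changes the individual local factors, but the product formula makes the changes cancel.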
For applications to diophantine geometry, it is useful to have a height function defined on algebraic points of an irreducible projective variety $V$ defined over $\overline{\mathbb{Q}}$. But without additional data there is no reasonable way to define a height on $V(\overline{\mathbb{Q}})$.
However, if $V$ is a subvariety of the projective space $\mathbb{P}^{n}$, then we may restrict the Weil height $h: \mathbb{P}^{n}(\overline{\mathbb{Q}}) \rightarrow \mathbb{R}$ to a function $V(\overline{\mathbb{Q}}) \rightarrow \mathbb{R}$. Slightly more generally, if $V \rightarrow \mathbb{P}^{n}$ is an immersion, then we may pull back the Weil height to $V(\overline{\mathbb{Q}})$.
Recall that an immersion $V \rightarrow \mathbb{P}^{n}$ is induced by a tuple of $(n+1)$ global sections of a very ample invertible sheaf on $V$. Conversely, given a very ample invertible sheaf $\mathscr{L}$ on $V$, we can fix a basis of the vector space of global sections of $\mathscr{L}$ and obtain an immersion $\iota_{\mathscr{L}}: V \rightarrow \mathbb{P}^{n}$. So we obtain a function $h \circ \iota_{\mathscr{L}}: V(\overline{\mathbb{Q}}) \rightarrow[0, \infty)$. There is a wrinkle here: this function depends not only on $(V, \mathscr{L})$ but also on the basis of the vector space of global sections. A different basis will lead to a function $V(\overline{\mathbb{Q}}) \rightarrow[0, \infty)$ that differs from $h \circ \iota_{\mathscr{L}}$ by a bounded function on $V(\overline{\mathbb{Q}})$. We define $h_{V, \mathscr{L}}$ to be the equivalence class of functions $V(\overline{\mathbb{Q}}) \rightarrow \mathbb{R}$ modulo bounded functions that contains $h \circ \iota_{\mathscr{L}}$.
If $\mathscr{L}$ is an ample invertible sheaf on $V$, then there exists an integer $n \geq 1$ such that $\mathscr{L}^{\otimes n}$ is very ample. We then define $h_{V, \mathscr{L}}=\frac{1}{n} h_{V, \mathscr{L}^{\otimes n}}$; this is again only defined up to a bounded function on $V(\overline{\mathbb{Q}})$. The equivalence class does not depend on the choice of $n$.
Finally, an arbitrary invertible sheaf $\mathscr{L}$ in the Picard group $\operatorname{Pic}(V)$ of $V$ is of the form $\mathcal{F} \otimes \mathcal{M}^{\otimes(-1)}$ with $\mathcal{F}$ and $\mathcal{M}$ ample on $V$. The difference $h_{V, \mathcal{F}}-h_{V, \mathcal{M}}$ is well defined up to a bounded function on $V(\overline{\mathbb{Q}})$. It does not depend on the pair $\mathcal{F}, \mathcal{M}$ with difference $\mathscr{L}$, and we denote it by $h_{V, \mathscr{L}}$. It is called the Weil height attached to $(V, \mathscr{L})$.
Theorem 2.4. Let us keep the notation above. In particular, $V$ is an irreducible projective variety defined over $\overline{\mathbb{Q}}$.
(i) The association $\mathscr{L} \mapsto h_{V, \mathscr{L}}$ is a group homomorphism with target the group of real-valued maps $V(\overline{\mathbb{Q}}) \rightarrow \mathbb{R}$ modulo bounded functions.
(ii) For $V$ equal to projective space and $\mathscr{L}$ the hyperplane bundle $\mathcal{O}(1)$, the Weil height from Definition 2.3 represents $h_{\mathbb{P}^{n}, \mathcal{O}(1)}$.
(iii) Suppose $W$ is a further irreducible projective variety defined over $\overline{\mathbb{Q}}$ and $f: W \rightarrow V$ is a morphism. For all $\mathscr{L} \in \operatorname{Pic}(V)$ we have $h_{V, \mathscr{L}} \circ f=h_{W, f^{*} \mathscr{L}}$. As usual, this equality is understood as an equality of equivalence classes of functions.
(iv) Suppose $\mathscr{L} \in \operatorname{Pic}(V)$ admits a nonzero global section $s$. Then $h_{V, \mathscr{L}}$ is bounded from below on the complement of the vanishing locus of $s$. In particular, $h_{V, \mathscr{L}}$ is bounded from below on a Zariski open and dense subset of $V$.
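Parts (i) to (iii) can be illustrated with the Veronese embedding $\nu_d: \mathbb{P}^{n} \rightarrow \mathbb{P}^{N}$, which sends a point to the tuple of all degree $d$ monomials in its coordinates and satisfies $\nu_d^{*} \mathcal{O}(1)=\mathcal{O}(1)^{\otimes d}$. The theorem predicts $h \circ \nu_d = d \cdot h$ up to a bounded function. The sketch below is an illustration and not part of the original text. It checks this for rational points with integer coordinates, where the relation happens to hold exactly. The function names are hypothetical, and `exp_height` returns $\exp h(P)$ as an integer.

```python
from functools import reduce
from itertools import combinations_with_replacement
from math import gcd, prod


def exp_height(coords) -> int:
    """exp(h(P)) for a point of P^n(Q) given by integer coordinates."""
    g = reduce(gcd, coords)
    if g == 0:
        raise ValueError("the zero vector does not define a point of P^n")
    # Dividing by the gcd passes to the primitive vector of Definition 2.1.
    return max(abs(x) for x in coords) // g


def veronese(coords, d: int):
    """Coordinates of the image under the degree d Veronese embedding."""
    return [prod(m) for m in combinations_with_replacement(coords, d)]


if __name__ == "__main__":
    P = [2, -3, 5]
    for d in (1, 2, 3):
        # exp(h(nu_d(P))) is compared with exp(h(P))^d
        print(d, exp_height(veronese(P, d)), exp_height(P) ** d)
```

The exact equality is special to this situation. For a general morphism $f$ the two sides of (iii) agree only up to a bounded function.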
Suppose that $V$ is defined over a number field $F \subseteq \overline{\mathbb{Q}}$ and $\mathscr{L} \in \operatorname{Pic}(V)$ is ample. Then the Northcott property holds for points of bounded degree, i.e.,
$$\left\{P \in V(\bar{F}): h_{V, \mathscr{L}}^{\prime}(P) \leq B \text{ and } [F(P): F] \leq D\right\}$$
is finite where $h_{V, \mathscr{L}}^{\prime}$ denotes any representative of $h_{V, \mathscr{L}}$.
Let $V$ be an irreducible projective variety defined over $\overline{\mathbb{Q}}$. We conclude this section by discussing a powerful tool to translate geometric information, here on intersection numbers, into an inequality of heights. The basic question is the following. Given invertible sheaves $\mathcal{F}$ and $\mathcal{M}$ on $V$, under what conditions can one bound $h_{V, \mathcal{M}}$ from above in terms of $h_{V, \mathcal{F}}$?
(i) We first consider the special case $V=\mathbb{P}^{n}$. As $\operatorname{Pic}\left(\mathbb{P}^{n}\right)$ is isomorphic to $\mathbb{Z}$, any Weil height is some integral multiple of $h_{\mathbb{P}^{n}, \mathcal{O}(1)}$. So $h_{V, \mathcal{F}}$ and $h_{V, \mathcal{M}}$ are $\mathbb{Z}$-linearly dependent.
(ii) Let us again suppose that $V$ is general and that $\mathcal{F}$ is ample. Then there exists an integer $k \geq 1$ such that $\mathcal{F}^{\otimes k} \otimes \mathcal{M}^{\otimes(-1)}$ is ample. So for some integer $l \geq 1$ the power $\mathcal{F}^{\otimes k l} \otimes \mathcal{M}^{\otimes(-l)}$ is very ample. In particular, it admits a global section that does not vanish at a prescribed point of $V(\overline{\mathbb{Q}})$. Theorem 2.4, parts (i) and (iv), imply
$$k l\, h_{V, \mathcal{F}}-l\, h_{V, \mathcal{M}}=h_{V, \mathcal{F}^{\otimes k l} \otimes \mathcal{M}^{\otimes(-l)}} \geq 0;$$
this must be parsed as an inequality between functions on $V(\overline{\mathbb{Q}})$ defined up to addition of a bounded function. We conclude
(2.2) h V , M k h V , F (2.2) h V , M ≤ k h V , F {:(2.2)h_(V,M) <= kh_(V,F):}\begin{equation*} h_{V, \mathcal{M}} \leq k h_{V, \mathscr{F}} \tag{2.2} \end{equation*}(2.2)hV,M≤khV,F
(iii) For some applications, such as Theorem 1.12, the ampleness hypothesis on $\mathcal{F}$ in (ii) is not flexible enough. Moreover, we would like some way to estimate the factor $k$ in (2.2) from above. We now describe a criterion of Siu that provides a solution to these two issues.
An invertible sheaf $\mathcal{L} \in \operatorname{Pic}(V)$ is called big if
$$\liminf_{k \rightarrow \infty} \frac{\dim H^{0}\left(V, \mathcal{L}^{\otimes k}\right)}{k^{\dim V}}>0;$$
here $H^{0}(V, \mathcal{L})$ denotes the vector space of global sections of $\mathcal{L}$.
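As a sanity check of this definition, here is a short worked example that is not taken from the text above: we assume $V=\mathbb{P}^{n}$ with $n \geq 1$ and $\mathcal{L}=\mathcal{O}(d)$, and use only the classical count of homogeneous polynomials. For $d \geq 1$ the global sections of $\mathcal{O}(d)^{\otimes k}=\mathcal{O}(dk)$ are the homogeneous polynomials of degree $dk$ in $n+1$ variables, so

$$\frac{\dim H^{0}\left(\mathbb{P}^{n}, \mathcal{O}(d)^{\otimes k}\right)}{k^{n}}=\frac{1}{k^{n}}\binom{dk+n}{n} \longrightarrow \frac{d^{n}}{n!}>0 \quad (k \rightarrow \infty).$$

Hence $\mathcal{O}(d)$ is big for $d \geq 1$. For $d=0$ the numerator equals $1$ and for $d<0$ it equals $0$, so the limit inferior is $0$ and $\mathcal{O}(d)$ is not big in these cases.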
If $\mathcal{L}$ is a big invertible sheaf, then $\mathcal{L}^{\otimes k}$ has a nonzero global section for some $k \geq 1$. Then using (i) and (iv) of Theorem 2.4 we see that $h_{V, \mathcal{L}}=\frac{1}{k} h_{V, \mathcal{L}^{\otimes k}}$ is bounded from below on a Zariski open and dense subset of $V$.
For example, if $\mathcal{L}=\mathcal{F} \otimes \mathcal{M}^{\otimes(-1)}$ is big, then, again by Theorem 2.4(i), we find $h_{V, \mathcal{F}} \geq h_{V, \mathcal{M}}$ on a Zariski open and dense subset of $V$.
We now come to Siu's Criterion; it ensures that $\mathcal{F} \otimes \mathcal{M}^{\otimes(-1)}$ is big. An invertible sheaf $\mathcal{L} \in \operatorname{Pic}(V)$ is called nef, or numerically effective, if $(\mathcal{L} \cdot[C]) \geq 0$ for all irreducible curves $C \subseteq V$.
Siu's Criterion requires that $\mathcal{F}$ and $\mathcal{M}$ are both nef and that the intersection numbers on $V$ satisfy $(\mathcal{F}^{\cdot \dim V})>(\dim V)(\mathcal{F}^{\cdot(\dim V-1)} \cdot \mathcal{M})$. With these hypotheses $\mathcal{F} \otimes \mathcal{M}^{\otimes(-1)}$ is big; see [42, Theorem 2.2.15].
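To see what the numerical condition says in the simplest case, consider the following illustration, which is not part of the original discussion. We assume $V=\mathbb{P}^{n}$, $\mathcal{F}=\mathcal{O}(a)$ and $\mathcal{M}=\mathcal{O}(b)$ with integers $a \geq 1$ and $b \geq 0$, so that both sheaves are nef, and we use the standard fact that $(\mathcal{O}(a_{1}) \cdots \mathcal{O}(a_{n}))=a_{1} \cdots a_{n}$ on $\mathbb{P}^{n}$. Then

$$(\mathcal{F}^{\cdot n})=a^{n}, \qquad (\mathcal{F}^{\cdot(n-1)} \cdot \mathcal{M})=a^{n-1} b,$$

and the hypothesis of Siu's Criterion reads

$$a^{n}>n\, a^{n-1} b \quad \Longleftrightarrow \quad a>n b.$$

In this case $\mathcal{F} \otimes \mathcal{M}^{\otimes(-1)}=\mathcal{O}(a-b)$ is indeed big. On the other hand, $\mathcal{O}(a-b)$ is already big as soon as $a>b$, so for $n \geq 2$ the criterion is sufficient but not necessary in this example.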
Say $\mathcal{F}$ and $\mathcal{M}$ are nef and $(\mathcal{F}^{\cdot \dim V})>0$. Let $k$ and $l$ be positive integers with
$$(\dim V) \frac{(\mathcal{F}^{\cdot(\dim V-1)} \cdot \mathcal{M})}{(\mathcal{F}^{\cdot \dim V})}<\frac{k}{l};$$
then $\mathcal{F}^{\otimes k} \otimes \mathcal{M}^{\otimes(-l)}$ is big. So
$$\left.h_{V, \mathcal{M}}\right|_{U} \leq\left.\frac{k}{l} h_{V, \mathcal{F}}\right|_{U}$$
holds on some Zariski open and dense $U \subseteq V$.
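The two steps compressed into the preceding lines can be spelled out as follows. This is a sketch that assumes only the multilinearity of intersection numbers and the properties of Theorem 2.4 in the form in which they are used in (ii). Write $n=\dim V$. The sheaves $\mathcal{F}^{\otimes k}$ and $\mathcal{M}^{\otimes l}$ are again nef, and

$$\left((\mathcal{F}^{\otimes k})^{\cdot n}\right)=k^{n}(\mathcal{F}^{\cdot n}), \qquad \left((\mathcal{F}^{\otimes k})^{\cdot(n-1)} \cdot \mathcal{M}^{\otimes l}\right)=k^{n-1} l\,(\mathcal{F}^{\cdot(n-1)} \cdot \mathcal{M}).$$

So the hypothesis of Siu's Criterion for the pair $(\mathcal{F}^{\otimes k}, \mathcal{M}^{\otimes l})$ is

$$k^{n}(\mathcal{F}^{\cdot n})>n\, k^{n-1} l\,(\mathcal{F}^{\cdot(n-1)} \cdot \mathcal{M}),$$

which, after division by $k^{n-1} l\,(\mathcal{F}^{\cdot n})>0$, is exactly the displayed bound on $k / l$. Hence $\mathcal{F}^{\otimes k} \otimes \mathcal{M}^{\otimes(-l)}$ is big, and its height is bounded from below on some Zariski open and dense $U \subseteq V$. Up to addition of a bounded function this gives

$$k\, h_{V, \mathcal{F}}-l\, h_{V, \mathcal{M}}=h_{V, \mathcal{F}^{\otimes k} \otimes \mathcal{M}^{\otimes(-l)}} \geq 0 \quad \text{on } U,$$

and dividing by $l$ yields the comparison of $h_{V, \mathcal{M}}$ with $\frac{k}{l} h_{V, \mathcal{F}}$ on $U$.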
This allows us to compare the heights $h_{V, \mathcal{M}}$ and $h_{V, \mathcal{F}}$ if we have information on the intersection numbers, at least on a rather large subset of $V(\overline{\mathbb{Q}})$.
Yuan [65] proved an arithmetic version of this criterion in his work on equidistribution. The author [34] used Siu's Criterion to study unlikely intersections in abelian varieties.

2.2. The canonical height on an abelian variety

Let $F \subseteq \overline{\mathbb{Q}}$ be a number field and $A$ an abelian variety defined over $F$. If $\mathcal{L}$